Novel CRISPR enzymes and systems

ABSTRACT

The present disclosure provides for systems, methods, and compositions for targeting nucleic acids. In particular, the invention provides mutated Cas13 proteins and their use in modifying target sequences as well as mutated Cas13 nucleic acid sequences and vectors encoding mutated Cas13 proteins and vector systems or CRISPR-Cas13 systems.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 62/712,809, filed Jul. 31, 2018, U.S. Provisional Application No. 62/751,421, filed Oct. 26, 2018, U.S. Provisional Application No. 62/775,865, filed Dec. 5, 2018, U.S. Provisional Application No. 62/822,639, filed Mar. 22, 2019, and U.S. Provisional Application No. 62/873,031, filed Jul. 11, 2019. The entire contents of the above-identified applications are hereby fully incorporated herein by reference.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

This invention was made with government support under Grant Nos. HG009761, MH110049 and HL141201 awarded by the National Institutes of Health. The government has certain rights in the invention.

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

The contents of the electronic sequence listing (“BROD-2660WP_ST25.txt”; size is 1,997,857 bytes; created on Jul. 25, 2019) are herein incorporated by reference in their entirety.

TECHNICAL FIELD

The present invention generally relates to systems, methods and compositions used for the control of gene expression involving sequence targeting, such as perturbation of gene transcripts or nucleic acid editing, that may use vector systems related to Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and components thereof.

BACKGROUND

The CRISPR-CRISPR associated (Cas) systems of bacterial and archaeal adaptive immunity show extreme diversity of protein composition and genomic loci architecture. The CRISPR-Cas system loci have more than 50 gene families and there are no strictly universal genes, indicating fast evolution and extreme diversity of loci architecture. So far, adopting a multi-pronged approach, there is comprehensive cas gene identification of about 395 profiles for 93 Cas proteins. Classification includes signature gene profiles plus signatures of locus architecture. A new classification of CRISPR-Cas systems is proposed in which these systems are broadly divided into two classes, Class 1 with multisubunit effector complexes and Class 2 with single-subunit effector modules exemplified by the Cas9 protein. Novel effector proteins associated with Class 2 CRISPR-Cas systems may be developed as powerful genome engineering tools and the prediction of putative novel effector proteins and their engineering and optimization is important. Novel Cas13b orthologues and uses thereof are desirable.

Following the demonstration that CRISPR-Cas9 could be repurposed for genome editing, interest in leveraging CRISPR systems led to the discovery of several new Cas enzymes and CRISPR systems with novel properties (1-3). Notable amongst these new discoveries are the Class 2 type VI CRISPR-Cas13 systems, which use a single enzyme to target RNA using a programmable CRISPR-RNA (crRNA) guide (1-6). Cas13 binding to target single-stranded RNA activates a general RNase activity that cleaves the target and degrades surrounding RNA non-specifically (4). Type VI systems have been used for RNA knockdown, transcript labeling, RNA editing, and ultra-sensitive virus detection (3, 4, 7-12). CRISPR-Cas13 systems are further divided into four subtypes based on the identity of the Cas13 protein (Cas13a-d) (2). All Cas13 protein family members contain two Higher Eukaryotes and Prokaryotes Nucleotide-binding (HEPN) domains. Citation or identification of any document in this application is not an admission that such document is available as prior art to the present invention.

There exists a pressing need for alternative and robust systems and techniques for targeting nucleic acids or polynucleotides (e.g. DNA or RNA or any hybrid or derivative thereof) with a wide array of applications, in particular development of effector proteins having an altered functionality, including, but not limited to, increased or decreased specificity, increased or decreased activity, altered specificity and/or activity, alternative PAM recognition, etc. This invention addresses this need and provides related advantages. Adding the novel RNA-targeting systems of the present application to the repertoire of genomic, transcriptomic, and epigenomic targeting technologies may transform the study and perturbation or editing of specific target sites through direct detection, analysis and manipulation. To utilize the RNA-targeting systems of the present application effectively for RNA targeting without deleterious effects, it is critical to understand aspects of engineering and optimization of these RNA targeting tools.

SUMMARY

In one aspect, the present disclosure provides an engineered CRISPR-Cas protein comprising one or more HEPN domains and further comprising one or more modified amino acids, wherein the amino acids: interact with a guide RNA that forms a complex with the engineered CRISPR-Cas protein; are in a HEPN active site, an inter-domain linker domain, a lid domain, a helical domain 1, a helical domain 2, or a bridge helix domain of the engineered CRISPR-Cas protein; or a combination thereof.

In some embodiments, the HEPN domain comprises an RxxxxH motif. In some embodiments, the RxxxxH motif comprises a R{N/H/K}X₁X₂X₃H (SEQ ID NO: 78) sequence. In some embodiments, in the R{N/H/K}X₁X₂X₃H sequence, X₁ is R, S, D, E, Q, N, G, or Y, X₂ is independently I, S, T, V, or L, and X₃ is independently L, F, N, Y, V, I, S, D, E, or A.
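By way of illustration only, the motif definitions above can be written as regular expressions and used to scan a protein sequence. The following is a minimal sketch, not part of the disclosed methods: the function name find_motifs and the toy sequence are hypothetical, and the sequence shown is not a Cas13 sequence.

```python
import re

# General HEPN motif as stated above: R, any four residues, then H.
# A lookahead is used so that overlapping occurrences are also reported.
RXXXXH = re.compile(r"(?=(R.{4}H))")

# Narrower form R{N/H/K}X1X2X3H with the residue sets listed above:
#   X1 in {R,S,D,E,Q,N,G,Y}, X2 in {I,S,T,V,L}, X3 in {L,F,N,Y,V,I,S,D,E,A}
R_NHK_X1X2X3_H = re.compile(r"(?=(R[NHK][RSDEQNGY][ISTVL][LFNYVISDEA]H))")


def find_motifs(sequence, pattern):
    """Return (1-based start position, matched motif) for every hit."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


# Hypothetical toy sequence, for illustration only.
toy = "MKRNRILHAAARAAAAHG"
general_hits = find_motifs(toy, RXXXXH)
narrow_hits = find_motifs(toy, R_NHK_X1X2X3_H)
```

In this toy input, the general pattern reports two hits while the narrower pattern keeps only the one whose second position is N, H, or K and whose X₁ to X₃ positions fall in the listed sets.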

In some embodiments, the CRISPR-Cas protein is a Type VI CRISPR Cas protein. In some embodiments, the Type VI CRISPR Cas protein is Cas13. In some embodiments, the Type VI CRISPR Cas protein is a Cas13a, a Cas13b, a Cas13c, or a Cas13d.

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296, N297, Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398, E399, K294, E400, R56, N157, H161, H452, N455, K484, N486, G566, H567, A656, V795, A796, W842, K871, E873, R874, R1068, N1069, or H1073.

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296, N297, Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398, E399, K294, E400, R56, N157, H161, H452, N455, K484, N486, G566, H567, W842, K871, E873, R874, R1068, N1069, or H1073.

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296, N297, Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398, E399, K294, or E400.

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K393, R402, N482, T405, H407, S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, R56, N157, H161, R1068, N1069, or H1073.

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N482, H407, S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, R56, N157, H161, R1068, N1069, or H1073.
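The phrase "an amino acid corresponding to" a PbCas13b residue implies a mapping from PbCas13b numbering onto another protein. One common way to obtain such a mapping is from a pairwise sequence alignment; the passages above do not state how correspondence is determined, so the following is only a sketch under that assumption. The function name map_reference_position and the toy aligned strings are hypothetical and are not Cas13 sequences.

```python
def map_reference_position(ref_aligned, query_aligned, ref_pos):
    """Map a 1-based reference residue number onto the query protein.

    ref_aligned and query_aligned are the two rows of a pairwise alignment
    (equal length, '-' marks a gap). Returns the 1-based query residue number
    aligned to reference residue ref_pos, or None if that reference residue
    is aligned to a gap in the query.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    query_count = 0
    for ref_char, query_char in zip(ref_aligned, query_aligned):
        if ref_char != "-":
            ref_count += 1
        if query_char != "-":
            query_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            return query_count if query_char != "-" else None
    raise ValueError("ref_pos is beyond the end of the reference sequence")


# Hypothetical toy alignment: the query has one inserted residue and lacks
# the last reference residue.
ref_row = "MK-RNH"
query_row = "MKARN-"
mapped_r3 = map_reference_position(ref_row, query_row, 3)
mapped_h5 = map_reference_position(ref_row, query_row, 5)
```

Under this sketch, reference residue 3 maps to query residue 4 because of the upstream insertion, and reference residue 5 has no counterpart in the query.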

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: W842, K846, K870, E873, or R877. In some embodiments, in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: W842, K846, K870, E873, or R877. In some embodiments, in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: W842, K846, K870, E873, or R877. In some embodiments, in the bridge helix domain one or more mutation of an amino acid corresponding to the following amino acids in the bridge helix domain of PbCas13b: W842, K846, K870, E873, or R877. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N480, N482, N652, or N653. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N480, or N482. In some embodiments, in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: K393, R402, N480, or N482. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: N652 or N653. In some embodiments, in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of PbCas13b: N652 or N653. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: T405, H407, S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741.
In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H407, S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, A656, K655, N652, H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, S757, N756, or K741. In some embodiments, in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of PbCas13b: S658, N653, A656, K655, N652, H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, S757, N756, or K741. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, S757, or N756.

In some embodiments, in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, S757, or N756. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, R762, V795, A796, R791, G566, S757, or N756. In some embodiments, in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: H567, H500, R762, V795, A796, R791, G566, S757, or N756. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K871, K857, K870, W842, E873, R877, K846, or R874. In some embodiments, in the bridge helix domain one or more mutation of an amino acid corresponding to the following amino acids in the bridge helix domain of PbCas13b: K871, K857, K870, W842, E873, R877, K846, or R874. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, or G566.

In some embodiments, in helical domain 1-2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-2 of PbCas13b: H567, H500, or G566. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, S757, or N756. In some embodiments, in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, S757, or N756. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: R762, V795, A796, R791, S757, or N756. In some embodiments, in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: R762, V795, A796, R791, S757, or N756. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, A656, K655, N652, K590, R638, or K741. In some embodiments, in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of PbCas13b: S658, N653, A656, K655, N652, K590, R638, or K741. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: T405, H407, N486, K484, N480, H452, N455, or K457. In some embodiments, in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: T405, H407, N486, K484, N480, H452, N455, or K457. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, K655, N652, H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, S757, N756, or K741.
In some embodiments, in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of PbCas13b: S658, N653, K655, N652, H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, S757, N756, or K741. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, S757, or N756. In some embodiments, in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, S757, or N756. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, R762, R791, G566, S757, or N756. In some embodiments, in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: H567, H500, R762, R791, G566, S757, or N756. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, S757, or N756. In some embodiments, in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, S757, or N756.

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: R762, R791, S757, or N756. In some embodiments, in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: R762, R791, S757, or N756. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, K655, N652, K590, R638, or K741. In some embodiments, in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of PbCas13b: S658, N653, K655, N652, K590, R638, or K741. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H407, N486, K484, N480, H452, N455, or K457.

In some embodiments, in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: H407, N486, K484, N480, H452, N455, or K457. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: R56, N157, H161, R1068, N1069, or H1073. In some embodiments, in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of PbCas13b: R56, N157, H161, R1068, N1069, or H1073. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: R56, N157, or H161. In some embodiments, in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of PbCas13b: R56, N157, or H161. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: R1068, N1069, or H1073. In some embodiments, in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of PbCas13b: R1068, N1069, or H1073. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N482, T405, H407, N486, K484, N480, H452, N455, or K457. In some embodiments, in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: K393, R402, N482, T405, H407, N486, K484, N480, H452, N455, or K457.
In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N482, H407, N486, K484, N480, H452, N455, or K457. In some embodiments, in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: K393, R402, N482, H407, N486, K484, N480, H452, N455, or K457. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: T405, H407, S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402, or N482. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H407, S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402, or N482. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: N486, K484, N480, H452, N455, or K457.

In some embodiments, in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: N486, K484, N480, H452, N455, or K457. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N482, N486, K484, N480, H452, N455, or K457. In some embodiments, in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: K393, R402, N482, N486, K484, N480, H452, N455, or K457. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402, or N482. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402, or N482.

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53 or Y164. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943 or R1041. In some embodiments, in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041. In some embodiments, in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53 or Y164. In some embodiments, in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943 or R1041. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, R1041, R56, N157, H161, R1068, N1069, or H1073. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, R56, N157, or H161. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073. In some embodiments, in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, R1041, R56, N157, H161, R1068, N1069, or H1073.
In some embodiments, in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53, Y164, R56, N157, or H161. In some embodiments, in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, or R1041. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, or K193. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943 or R1041.

In some embodiments, in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, or R1041. In some embodiments, in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, or K193. In some embodiments, in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943 or R1041. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, R1041, R56, N157, H161, R1068, N1069, or H1073. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, R56, N157, or H161. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073. In some embodiments, in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, R1041, R56, N157, H161, R1068, N1069, or H1073. In some embodiments, in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, R56, N157, or H161.

In some embodiments, in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K183 or K193. In some embodiments, in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): K183 or K193. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041. In some embodiments, in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, K943, or R1041; preferably R53A, R53K, R53D, or R53E; K943A, K943R, K943D, or K943E; or R1041A, R1041K, R1041D, or R1041E. In some embodiments, in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, K943, or R1041; preferably R53A, R53K, R53D, or R53E; K943A, K943R, K943D, or K943E; or R1041A, R1041K, R1041D, or R1041E. In some embodiments, a mutation of an amino acid corresponding to amino acid Y164 of Prevotella buccae Cas13b (PbCas13b), preferably Y164A, Y164F, or Y164W.
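Substitutions such as R53A or Y164F follow the usual notation of wild-type residue, 1-based position, and replacement residue. The following minimal sketch shows how such labels may be parsed and applied to a sequence while checking that the stated wild-type residue is actually present. The names Substitution, parse_substitution, and apply_substitutions are hypothetical, and the toy sequence is for illustration only and is not the PbCas13b sequence.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str   # one-letter code of the original residue
    position: int    # 1-based residue number
    mutant: str      # one-letter code of the replacement residue


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Parse a label such as 'R53A' into its three parts."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitutions(sequence, labels):
    """Return a copy of sequence with every labelled substitution applied."""
    residues = list(sequence)
    for label in labels:
        sub = parse_substitution(label)
        index = sub.position - 1
        if index < 0 or index >= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[index] != sub.wild_type:
            raise ValueError(
                f"{label}: expected {sub.wild_type}, found {residues[index]}"
            )
        residues[index] = sub.mutant
    return "".join(residues)


# Hypothetical five-residue toy sequence, for illustration only.
toy_mutant = apply_substitutions("MRKYH", ["R2A", "Y4F"])
```

The wild-type check is a deliberate design choice in this sketch: it catches numbering mistakes, for example when a label written in PbCas13b numbering is applied to a different protein without first mapping the position.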

In some embodiments, in HEPN domain 1 a mutation of an amino acid corresponding to amino acid Y164 in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b), preferably Y164A, Y164F, or Y164W. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, D434, K431, R402, K393, R482, N480, D396, E397, D398, or E399. In some embodiments, in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, D434, K431, R402, K393, R482, N480, D396, E397, D398, or E399. In some embodiments, a mutation of an amino acid corresponding to amino acid H407 of Prevotella buccae Cas13b (PbCas13b), preferably H407Y, H407W, or H407F. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R402, K393, R482, N480, D396, E397, D398, or E399. In some embodiments, in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): R402, K393, R482, N480, D396, E397, D398, or E399. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K457, D434, or K431. In some embodiments, in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): K457, D434, or K431.

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, Q646, N647, N653, or N652. In some embodiments, in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of Prevotella buccae Cas13b (PbCas13b): H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, Q646, N647, N653, or N652. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some embodiments, in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, or R791. In some embodiments, in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, or R791.

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some embodiments, in the bridge helix domain one or more mutation of an amino acid corresponding to the following amino acids in the bridge helix domain of Prevotella buccae Cas13b (PbCas13b): K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500 or K570. In some embodiments, in helical domain 1-2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-2 of Prevotella buccae Cas13b (PbCas13b): H500 or K570. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some embodiments, in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, or R791. In some embodiments, in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, or R791. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, or R877.
In some embodiments, in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, or R877. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some embodiments, in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, K744, R600, K607, K612, R614, K617, R618, Q646, N647, N653, or N652. In some embodiments, in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, K744, R600, K607, K612, R614, K617, R618, Q646, N647, N653, or N652.

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): Q646 or N647. In some embodiments, in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): Q646 or N647. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N653 or N652. In some embodiments, in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): N653 or N652. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, or K744. In some embodiments, in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, or K744. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R600, K607, K612, R614, K617, or R618. In some embodiments, in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): R600, K607, K612, R614, K617, or R618. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, E296, N297, or K294. In some embodiments, in the IDL domain one or more mutation of an amino acid corresponding to the following amino acids in the IDL domain of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, E296, N297, or K294.
In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, K292, E296, or N297. In some embodiments, in the IDL domain one or more mutation of an amino acid corresponding to the following amino acids in the IDL domain of Prevotella buccae Cas13b (PbCas13b): R285, K292, E296, or N297.

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R285, R287, K292, E296, N297, Q646, N647, or K294. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R402, K393, N653, N652, R482, N480, D396, E397, D398, or E399. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, K655, R762, or R1041; preferably R53A or R53D; K655A; R762A; or R1041E or R1041D. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N297, E296, K292, or R285; preferably N297A, E296A, K292A, or R285A. In some embodiments, in (the central channel of) the IDL domain one or more mutation of an amino acid corresponding to the following amino acids in (the central channel of) the IDL domain of Prevotella buccae Cas13b (PbCas13b): N297, E296, K292, or R285; preferably N297A, E296A, K292A, or R285A. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): Q831, K836, R838, N652, N653, R830, K655 or R762; preferably Q831A, K836A, R838A, N652A, N653A, R830A, K655A, or R762A. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N652, N653, R830, K655 or R762; preferably N652A, N653A, R830A, K655A, or R762A.
In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K655 or R762; preferably K655A or R762A. In some embodiments, in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of Prevotella buccae Cas13b (PbCas13b): Q831, K836, R838, N652, N653, R830, K655 or R762; preferably Q831A, K836A, R838A, N652A, N653A, R830A, K655A, or R762A.

In some embodiments, in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of Prevotella buccae Cas13b (PbCas13b): N652, N653, R830, K655 or R762; preferably N652A, N653A, R830A, K655A, or R762A. In some embodiments, in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): K655 or R762; preferably K655A or R762A. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R614, K607, K193, K183 or R600; preferably R614A, K607A, K193A, K183A or R600A. In some embodiments, in the trans-subunit loop of helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in the trans-subunit loop of helical domain 2 of Prevotella buccae Cas13b (PbCas13b): Q646 or N647; preferably Q646A or N647A. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53 or R1041; preferably R53A or R53D, or R1041E or R1041D. In some embodiments, in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53 or R1041; preferably R53A or R53D, or R1041E or R1041D. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K457, D397, E398, D399, E400, T405, H407 or D434; preferably D397A, E398A, D399A, E400A, T405A, H407A, H407W, H407Y, H407F or D434A. In some embodiments, in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): K457, D397, E398, D399, E400, T405, H407 or D434; preferably D397A, E398A, D399A, E400A, T405A, H407A, H407W, H407Y, H407F or D434A.
In some embodiments, the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): amino acids 46-57, 73-79, 152-164, 1036-1046, and 1064-1074. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R156, N157, H161, R1068, N1069, and H1073. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, K294, E296, and N297. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, and R838. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, and R877.
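By way of illustration only, the substitution labels used above (for example R53A, K655A, or R1041E) can be read as a wild-type residue, a 1-based position, and a replacement residue. The following minimal Python sketch parses such a label and applies it to a sequence. The names `Substitution`, `parse_substitution`, and `apply_substitution`, and the ten-residue toy sequence, are hypothetical and introduced here for illustration; the toy sequence is not the PbCas13b sequence.

```python
import re
from typing import NamedTuple

class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the residue in the reference protein
    position: int   # 1-based position in the reference protein
    mutant: str     # one-letter code of the replacement residue

# The 20 standard amino acids in one-letter code.
_LABEL = re.compile(r"^([ACDEFGHIKLMNPQRSTVWY])(\d+)([ACDEFGHIKLMNPQRSTVWY])$")

def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'R53A' into its three parts."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))

def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return a copy of `sequence` carrying the substitution.

    Raises ValueError if the position is out of range or the residue
    found there differs from the stated wild-type residue.
    """
    index = sub.position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {sub.position} is outside the sequence")
    if sequence[index] != sub.wild_type:
        raise ValueError(
            f"expected {sub.wild_type} at position {sub.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + sub.mutant + sequence[index + 1:]

# Hypothetical toy sequence, used only to exercise the functions.
toy_sequence = "MKRLAHNEQS"
print(apply_substitution(toy_sequence, parse_substitution("R3A")))
```

Checking the wild-type residue before substituting guards against applying a label numbered for one orthologue to the sequence of another.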

In some embodiments, a mutation of an amino acid corresponding to amino acid T405 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid H407 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K457 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid H500 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K570 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K590 of Prevotella buccae Cas13b (PbCas13b).

In some embodiments, a mutation of an amino acid corresponding to amino acid N634 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R638 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid N652 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid N653 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K655 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid S658 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K741 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K744 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid N756 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid S757 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R762 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R791 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K846 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K857 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K870 of Prevotella buccae Cas13b (PbCas13b).
In some embodiments, a mutation of an amino acid corresponding to amino acid R877 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K183 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K193 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R600 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K607 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K612 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R614 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K617 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K826 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K828 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K829 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R824 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R830 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid Q831 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K835 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K836 of Prevotella buccae Cas13b (PbCas13b).
In some embodiments, a mutation of an amino acid corresponding to amino acid R838 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R618 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid D434 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K431 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R53 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K943 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R1041 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid Y164 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R285 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R287 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K292 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid E296 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid N297 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid Q646 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid N647 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R402 of Prevotella buccae Cas13b (PbCas13b).
In some embodiments, a mutation of an amino acid corresponding to amino acid K393 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid N653 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid N652 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R482 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid N480 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid D396 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid E397 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid D398 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid E399 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K294 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid E400 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R56 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid N157 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid H161 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid H452 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid N455 of Prevotella buccae Cas13b (PbCas13b).
In some embodiments, a mutation of an amino acid corresponding to amino acid K484 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid N486 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid G566 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid H567 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid A656 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid V795 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid A796 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid W842 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid K871 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid E873 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R874 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid R1068 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid N1069 of Prevotella buccae Cas13b (PbCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid H1073 of Prevotella buccae Cas13b (PbCas13b).

In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, H602, R1278, N1279, or H1283. In some embodiments, in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, H602, R1278, N1279, or H1283. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, or H602. In some embodiments, in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, or H602. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R1278, N1279, or H1283. In some embodiments, in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Leptotrichia shahii Cas13a (LshCas13a): R1278, N1279, or H1283. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R146, H151, R1116, or H1121.
In some embodiments, in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Porphyromonas gulae Cas13b (PguCas13b): R146, H151, R1116, or H1121. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R146 or H151. In some embodiments, in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Porphyromonas gulae Cas13b (PguCas13b): R146 or H151. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R1116 or H1121. In some embodiments, in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Porphyromonas gulae Cas13b (PguCas13b): R1116 or H1121. In some embodiments, one or more mutation of an amino acid corresponding to the following amino acids of Prevotella sp. P5-125 Cas13b (PspCas13b): H133 or H1058. In some embodiments, in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella sp. P5-125 Cas13b (PspCas13b): H133 or H1058. In some embodiments, a mutation of an amino acid corresponding to amino acid H133 of Prevotella sp. P5-125 Cas13b (PspCas13b). In some embodiments, in HEPN domain 1 a mutation of an amino acid corresponding to amino acid H133 in HEPN domain 1 of Prevotella sp. P5-125 Cas13b (PspCas13b). In some embodiments, a mutation of an amino acid corresponding to amino acid H1058 of Prevotella sp. P5-125 Cas13b (PspCas13b). In some embodiments, in HEPN domain 2 a mutation of an amino acid corresponding to the amino acid H1058 in HEPN domain 2 of Prevotella sp. P5-125 Cas13b (PspCas13b).
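By way of illustration only, an amino acid of one orthologue "corresponding to" a numbered amino acid of a reference protein can be located from a pairwise sequence alignment of the two proteins. The following minimal Python sketch assumes the alignment is already available as two gapped strings of equal length, with "-" marking gaps. The function name `corresponding_position` and the toy alignment are hypothetical; how the alignment itself is computed is outside the scope of the sketch, and the disclosure does not prescribe this procedure.

```python
from typing import Optional

def corresponding_position(
    ref_aligned: str, other_aligned: str, ref_position: int
) -> Optional[int]:
    """Map a 1-based residue position in the reference protein to the
    1-based position of the aligned residue in the other protein.

    Returns None when the reference residue is aligned to a gap.
    """
    if len(ref_aligned) != len(other_aligned):
        raise ValueError("aligned sequences must have the same length")
    ref_count = 0    # residues of the reference seen so far
    other_count = 0  # residues of the other protein seen so far
    for ref_char, other_char in zip(ref_aligned, other_aligned):
        if ref_char != "-":
            ref_count += 1
        if other_char != "-":
            other_count += 1
        if ref_char != "-" and ref_count == ref_position:
            return other_count if other_char != "-" else None
    raise ValueError(f"position {ref_position} is beyond the reference sequence")

# Hypothetical toy alignment, not real Cas13 sequence data.
reference = "MK-RNAH"
orthologue = "MKQRN-H"
print(corresponding_position(reference, orthologue, 3))  # R3 of the reference
print(corresponding_position(reference, orthologue, 5))  # A5 faces a gap
```

Because insertions and deletions shift numbering, the corresponding residue generally carries a different position number in each orthologue, as the differing HEPN residue numbers listed above for LshCas13a, PguCas13b, and PspCas13b reflect.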

In some embodiments, the amino acid is mutated to A, P, or V, preferably A. In some embodiments, said amino acid is mutated to a hydrophobic amino acid. In some embodiments, said amino acid is mutated to an aromatic amino acid. In some embodiments, said amino acid is mutated to a charged amino acid. In some embodiments, said amino acid is mutated to a positively charged amino acid. In some embodiments, said amino acid is mutated to a negatively charged amino acid. In some embodiments, said amino acid is mutated to a polar amino acid. In some embodiments, said amino acid is mutated to an aliphatic amino acid. In some embodiments, the engineered CRISPR-Cas protein further comprises a functional heterologous domain.
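By way of illustration only, the residue classes named above can be represented as sets of one-letter codes. The following minimal Python sketch uses one common textbook grouping, which is an assumption made here: class membership varies between references (for example, histidine, glycine, tyrosine, and cysteine are assigned differently by different authors), and this passage does not define the membership of each class. The names `AMINO_ACID_CLASSES` and `classes_of` are hypothetical.

```python
# One common textbook grouping, assumed for illustration; other references
# assign borderline residues (H, G, Y, C) differently.
AMINO_ACID_CLASSES = {
    "hydrophobic": set("AVLIMFWP"),
    "aromatic": set("FWY"),
    "positively charged": set("KRH"),
    "negatively charged": set("DE"),
    "polar": set("STNQCY"),
    "aliphatic": set("AVLI"),
}
# "charged" is the union of the two charged classes.
AMINO_ACID_CLASSES["charged"] = (
    AMINO_ACID_CLASSES["positively charged"]
    | AMINO_ACID_CLASSES["negatively charged"]
)

def classes_of(residue: str) -> list:
    """Return the sorted names of every class that contains the residue."""
    residue = residue.upper()
    return sorted(
        name for name, members in AMINO_ACID_CLASSES.items() if residue in members
    )

# Alanine, the preferred replacement residue named above.
print(classes_of("A"))
# Aspartate, as in the R53D and R1041D substitutions.
print(classes_of("D"))
```

Under this grouping a single residue can fall into several classes at once, so the class-based embodiments above are not mutually exclusive.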

In some embodiments, the Cas13 protein is from a species of the genus Alistipes, Anaerosalibacter, Bacteroides, Bacteroidetes, Bergeyella, Blautia, Butyrivibrio, Capnocytophaga, Carnobacterium, Chloroflexus, Chryseobacterium, Clostridium, Demequina, Eubacteriaceae, Eubacterium, Flavobacterium, Fusobacterium, Herbinix, Insolitispirillum, Lachnospiraceae, Leptotrichia, Listeria, Myroides, Paludibacter, Phaeodactylibacter, Porphyromonadaceae, Porphyromonas, Prevotella, Pseudobutyrivibrio, Psychroflexus, Reichenbachiella, Rhodobacter, Riemerella, Sinomicrobium, Thalassospira, or Ruminococcus; preferably Leptotrichia shahii, Listeria seeligeri, Lachnospiraceae bacterium (such as Lb MA2020, Lb NK4A179, Lb NK4A144), Clostridium aminophilum (such as Ca DSM 10710), Carnobacterium gallinarum (such as Cg DSM 4847), Paludibacter propionicigenes (such as Pp WB4), Listeria weihenstephanensis (such as Lw FSL R9-0317), Listeriaceae bacterium (such as Lb FSL M6-0635), Leptotrichia wadei (such as Lw F0279), Rhodobacter capsulatus (such as Rc SB 1003, Rc R121, Rc DE442), Leptotrichia buccalis (such as Lb C-1013-b), Herbinix hemicellulosilytica, Eubacteriaceae bacterium (such as Eb CHKCI004), Blautia sp. Marseille-P2398, Leptotrichia sp. oral taxon 879 str. F0557, Chloroflexus aggregans, Demequina aurantiaca, Thalassospira sp. TSL5-1, Pseudobutyrivibrio sp. OR37, Butyrivibrio sp. YAB3001, Leptotrichia sp. Marseille-P3007, Bacteroides ihuae, Porphyromonadaceae bacterium (such as Pb KH3CP3RA), Listeria riparia, Insolitispirillum peregrinum, Alistipes sp. ZOR0009, Bacteroides pyogenes (such as Bp F0041), Bacteroidetes bacterium (such as Bb GWA2_31_9), Bergeyella zoohelcum (such as Bz ATCC 43767), Capnocytophaga canimorsus, Capnocytophaga cynodegmi, Chryseobacterium carnipullorum, Chryseobacterium jejuense, Chryseobacterium ureilyticum, Flavobacterium branchiophilum, Flavobacterium columnare, Flavobacterium sp. 316, Myroides odoratimimus (such as Mo CCUG 10230, Mo CCUG 12901, Mo CCUG 3837), Paludibacter propionicigenes, Phaeodactylibacter xiamenensis, Porphyromonas gingivalis (such as Pg F0185, Pg F0568, Pg JCVI SC001, Pg W4087), Porphyromonas gulae, Porphyromonas sp. COT-052 OH4946, Prevotella aurantiaca, Prevotella buccae (such as Pb ATCC 33574), Prevotella falsenii, Prevotella intermedia (such as Pi 17, Pi ZT), Prevotella pallens (such as Pp ATCC 700821), Prevotella pleuritidis, Prevotella saccharolytica (such as Ps F0055), Prevotella sp. MA2016, Prevotella sp. MSX73, Prevotella sp. P4-76, Prevotella sp. P5-119, Prevotella sp. P5-125, Prevotella sp. P5-60, Psychroflexus torquis, Reichenbachiella agariperforans, Riemerella anatipestifer, Sinomicrobium oceani, Fusobacterium necrophorum (such as Fn subsp. funduliforme ATCC 51357, Fn DJ-2, Fn BFTR-1, Fn subsp. funduliforme), Fusobacterium perfoetens (such as Fp ATCC 29250), Fusobacterium ulcerans (such as Fu ATCC 49185), Anaerosalibacter sp. ND1, Eubacterium siraeum, Ruminococcus flavefaciens (such as Rfx XPD3002), or Ruminococcus albus.

In some embodiments, the Cas13 protein is a Cas13a protein.

In some embodiments, the Cas13a protein is from a species of the genus Bacteroides, Blautia, Butyrivibrio, Carnobacterium, Chloroflexus, Clostridium, Demequina, Eubacterium, Herbinix, Insolitispirillum, Lachnospiraceae, Leptotrichia, Listeria, Paludibacter, Porphyromonadaceae, Pseudobutyrivibrio, Rhodobacter, or Thalassospira; preferably Leptotrichia shahii, Listeria seeligeri, Lachnospiraceae bacterium (such as Lb MA2020, Lb NK4A179, Lb NK4A144), Clostridium aminophilum (such as Ca DSM 10710), Carnobacterium gallinarum (such as Cg DSM 4847), Paludibacter propionicigenes (such as Pp WB4), Listeria weihenstephanensis (such as Lw FSL R9-0317), Listeriaceae bacterium (such as Lb FSL M6-0635), Leptotrichia wadei (such as Lw F0279), Rhodobacter capsulatus (such as Rc SB 1003, Rc R121, Rc DE442), Leptotrichia buccalis (such as Lb C-1013-b), Herbinix hemicellulosilytica, Eubacteriaceae bacterium (such as Eb CHKCI004), Blautia sp. Marseille-P2398, Leptotrichia sp. oral taxon 879 str. F0557, Chloroflexus aggregans, Demequina aurantiaca, Thalassospira sp. TSL5-1, Pseudobutyrivibrio sp. OR37, Butyrivibrio sp. YAB3001, Leptotrichia sp. Marseille-P3007, Bacteroides ihuae, Porphyromonadaceae bacterium (such as Pb KH3CP3RA), Listeria riparia, or Insolitispirillum peregrinum.

In some embodiments, the Cas13 protein is a Cas13b protein.

In some embodiments, the Cas13b protein is from a species of the genus Alistipes, Bacteroides, Bacteroidetes, Bergeyella, Capnocytophaga, Chryseobacterium, Flavobacterium, Myroides, Paludibacter, Phaeodactylibacter, Porphyromonas, Prevotella, Psychroflexus, Reichenbachiella, Riemerella, or Sinomicrobium; preferably Alistipes sp. ZOR0009, Bacteroides pyogenes (such as Bp F0041), Bacteroidetes bacterium (such as Bb GWA2_31_9), Bergeyella zoohelcum (such as Bz ATCC 43767), Capnocytophaga canimorsus, Capnocytophaga cynodegmi, Chryseobacterium carnipullorum, Chryseobacterium jejuense, Chryseobacterium ureilyticum, Flavobacterium branchiophilum, Flavobacterium columnare, Flavobacterium sp. 316, Myroides odoratimimus (such as Mo CCUG 10230, Mo CCUG 12901, Mo CCUG 3837), Paludibacter propionicigenes, Phaeodactylibacter xiamenensis, Porphyromonas gingivalis (such as Pg F0185, Pg F0568, Pg JCVI SC001, Pg W4087), Porphyromonas gulae, Porphyromonas sp. COT-052 OH4946, Prevotella aurantiaca, Prevotella buccae (such as Pb ATCC 33574), Prevotella falsenii, Prevotella intermedia (such as Pi 17, Pi ZT), Prevotella pallens (such as Pp ATCC 700821), Prevotella pleuritidis, Prevotella saccharolytica (such as Ps F0055), Prevotella sp. MA2016, Prevotella sp. MSX73, Prevotella sp. P4-76, Prevotella sp. P5-119, Prevotella sp. P5-125, Prevotella sp. P5-60, Psychroflexus torquis, Reichenbachiella agariperforans, Riemerella anatipestifer, or Sinomicrobium oceani.

In some embodiments, the Cas13 protein is a Cas13c protein.

In some embodiments, the Cas13c protein is from a species of the genus Fusobacterium or Anaerosalibacter; preferably Fusobacterium necrophorum (such as Fn subsp. funduliforme ATCC 51357, Fn DJ-2, Fn BFTR-1, Fn subsp. funduliforme), Fusobacterium perfoetens (such as Fp ATCC 29250), Fusobacterium ulcerans (such as Fu ATCC 49185), or Anaerosalibacter sp. ND1.

In some embodiments, the Cas13 protein is a Cas13d protein.

In some embodiments, the Cas13d protein is from a species of the genus Eubacterium or Ruminococcus, preferably Eubacterium siraeum, Ruminococcus flavefaciens (such as Rfx XPD3002), or Ruminococcus albus.
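By way of illustration only, the genus lists given above for Cas13a, Cas13b, Cas13c, and Cas13d can be held in a simple lookup structure. The following minimal Python sketch transcribes those genus lists and reports which subtypes are recited for a given genus. The names `GENERA_BY_SUBTYPE` and `subtypes_for_genus` are hypothetical, and the lookup reflects only the lists in this passage rather than any external classification.

```python
# Genus lists transcribed from the Cas13a, Cas13b, Cas13c, and Cas13d
# embodiments above.
GENERA_BY_SUBTYPE = {
    "Cas13a": {
        "Bacteroides", "Blautia", "Butyrivibrio", "Carnobacterium",
        "Chloroflexus", "Clostridium", "Demequina", "Eubacterium",
        "Herbinix", "Insolitispirillum", "Lachnospiraceae", "Leptotrichia",
        "Listeria", "Paludibacter", "Porphyromonadaceae",
        "Pseudobutyrivibrio", "Rhodobacter", "Thalassospira",
    },
    "Cas13b": {
        "Alistipes", "Bacteroides", "Bacteroidetes", "Bergeyella",
        "Capnocytophaga", "Chryseobacterium", "Flavobacterium", "Myroides",
        "Paludibacter", "Phaeodactylibacter", "Porphyromonas", "Prevotella",
        "Psychroflexus", "Reichenbachiella", "Riemerella", "Sinomicrobium",
    },
    "Cas13c": {"Fusobacterium", "Anaerosalibacter"},
    "Cas13d": {"Eubacterium", "Ruminococcus"},
}

def subtypes_for_genus(genus: str) -> list:
    """Return the sorted Cas13 subtypes whose genus list names `genus`."""
    return sorted(
        subtype
        for subtype, genera in GENERA_BY_SUBTYPE.items()
        if genus in genera
    )

print(subtypes_for_genus("Prevotella"))
print(subtypes_for_genus("Eubacterium"))
```

Some genera are recited for more than one subtype (Bacteroides and Paludibacter for both Cas13a and Cas13b, Eubacterium for both Cas13a and Cas13d), so the lookup returns a list rather than a single subtype.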

In some embodiments, the catalytic activity of the engineered CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the catalytic activity of the engineered CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the gRNA binding of the engineered CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the gRNA binding of the engineered CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the specificity of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the specificity of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the stability of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the stability of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the engineered CRISPR-Cas protein further comprises one or more mutations which inactivate catalytic activity. In some embodiments, the off-target binding of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the off-target binding of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the target binding of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the target binding of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the engineered CRISPR-Cas protein has a higher protease activity or polynucleotide-binding capability compared with a corresponding wildtype CRISPR-Cas protein.
In some embodiments, PFS recognition is altered as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the engineered CRISPR-Cas protein further comprises a functional heterologous domain. In some embodiments, the engineered CRISPR-Cas protein further comprises an NLS.

In another aspect, the present disclosure provides a CRISPR-Cas protein that comprises one or more HEPN domains and is less than 1000 amino acids in length. In some embodiments, the protein is less than 950, less than 900, less than 850, less than 800, or less than 750 amino acids in size. In some embodiments, the HEPN domain comprises an RxxxxH motif sequence. In some embodiments, the RxxxxH motif comprises a R[N/H/K]X₁X₂X₃H sequence. In some embodiments, X₁ is R, S, D, E, Q, N, G, or Y, X₂ is independently I, S, T, V, or L, and X₃ is independently L, F, N, Y, V, I, S, D, E, or A. In some embodiments, the CRISPR-Cas protein is a Type VI CRISPR Cas protein. In some embodiments, the Type VI CRISPR Cas protein is a Cas13a, a Cas13b, a Cas13c, or a Cas13d. In some embodiments, the CRISPR-Cas protein is associated with a functional domain. In some embodiments, the CRISPR-Cas protein comprises one or more mutations equivalent to mutations described herein. In some embodiments, the CRISPR-Cas protein comprises one or more mutations in the helical domain. In some embodiments, the CRISPR-Cas protein is in a dead form or has nickase activity.
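By way of illustration only, the R[N/H/K]X₁X₂X₃H motif and the length limit described above can be expressed as a regular expression and a simple length check. The following minimal Python sketch encodes the residue sets exactly as listed for X₁, X₂, and X₃. The names `HEPN_MOTIF`, `find_hepn_motifs`, and `is_compact_hepn_candidate`, and the toy sequences, are hypothetical; a motif match in a sequence is not by itself evidence of a functional HEPN domain.

```python
import re

# R, then N/H/K, then X1, X2, X3 drawn from the sets listed above, then H.
HEPN_MOTIF = re.compile(r"R[NHK][RSDEQNGY][ISTVL][LFNYVISDEA]H")

def find_hepn_motifs(sequence: str) -> list:
    """Return (1-based start position, matched motif) for every motif hit."""
    return [
        (match.start() + 1, match.group(0))
        for match in HEPN_MOTIF.finditer(sequence.upper())
    ]

def is_compact_hepn_candidate(sequence: str, max_length: int = 1000) -> bool:
    """True when the sequence is shorter than `max_length` residues and
    contains at least one motif hit."""
    return len(sequence) < max_length and bool(find_hepn_motifs(sequence))

# Hypothetical toy sequences, not real Cas13 sequence data.
print(find_hepn_motifs("MAAARNRILHGGG"))
print(find_hepn_motifs("MAAARAAAAHGGG"))
```

The `max_length` parameter defaults to the 1000-residue limit and can be lowered to 950, 900, 850, 800, or 750 to reflect the narrower size embodiments.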

In another aspect, the present disclosure provides a polynucleic acid encoding the engineered CRISPR-Cas protein herein. In some embodiments, the polynucleic acid is codon optimized.

In another aspect, the present disclosure provides a CRISPR-Cas system comprising the engineered CRISPR-Cas protein herein or the polynucleotide herein, and a nucleotide component capable of forming a complex with the engineered CRISPR-Cas protein and able to hybridize with a target nucleic acid sequence and direct sequence-specific binding of said complex to the target nucleic acid sequence.

In another aspect, the present disclosure provides a vector system comprising one or more vectors, the one or more vectors comprising one or more polynucleotide molecules encoding components of the engineered CRISPR-Cas protein.

In another aspect, the present disclosure provides a method of modifying a target nucleic acid comprising: introducing in a cell or organism that comprises the target nucleic acid, the engineered CRISPR-Cas protein, the polynucleotide, the CRISPR-Cas system, or the vector or vector system described herein, such that the engineered CRISPR-Cas protein modifies the target nucleic acid in the cell or organism.

In some embodiments, the engineered CRISPR-Cas system is introduced via delivery by liposomes, nanoparticles, exosomes, microvesicles, nucleic acid nanoassemblies, a gene gun, an implantable device, or the vector system herein. In some embodiments, the engineered CRISPR-Cas protein is associated with one or more functional domains. In some embodiments, the target nucleic acid comprises a genomic locus, and the engineered CRISPR-Cas protein modifies a gene product encoded at the genomic locus or expression of the gene product. In some embodiments, the target nucleic acid is DNA or RNA and wherein one or more nucleotides in the target nucleic acid are base edited. In some embodiments, the target nucleic acid is DNA or RNA and wherein the target nucleic acid is cleaved. In some embodiments, the engineered CRISPR-Cas protein further cleaves non-target nucleic acid. In some embodiments, the method further comprises visualizing activity and, optionally, using a detectable label. In some embodiments, the method further comprises detecting binding of one or more components of the CRISPR-Cas system to the target nucleic acid. In some embodiments, said cell or organism is a eukaryotic cell or organism. In some embodiments, said cell or organism is an animal cell or organism. In some embodiments, said cell or organism is a plant cell or organism.

In another aspect, the present disclosure provides a method for detecting a target nucleic acid in a sample comprising: contacting a sample with: an engineered CRISPR-Cas protein herein; at least one guide polynucleotide comprising a guide sequence capable of binding to the target nucleic acid and designed to form a complex with the engineered CRISPR-Cas; and an RNA-based masking construct comprising a non-target sequence; wherein the engineered CRISPR-Cas protein exhibits collateral RNase activity and cleaves the non-target sequence of the detection construct; and detecting a signal from cleavage of the non-target sequence, thereby detecting the target nucleic acid in the sample.

In some embodiments, the method further comprises contacting the sample with reagents for amplifying the target nucleic acid. In some embodiments, the reagents for amplifying comprise isothermal amplification reaction reagents. In some embodiments, the isothermal amplification reagents comprise nucleic-acid sequence-based amplification, recombinase polymerase amplification, loop-mediated isothermal amplification, strand displacement amplification, helicase-dependent amplification, or nicking enzyme amplification reagents. In some embodiments, the target nucleic acid is a DNA molecule and the method further comprises contacting the target DNA molecule with a primer comprising an RNA polymerase site and RNA polymerase. In some embodiments, the masking construct: suppresses generation of a detectable positive signal until the masking construct is cleaved or deactivated, or masks a detectable positive signal or generates a detectable negative signal until the masking construct is cleaved or deactivated.

In some embodiments, the masking construct comprises: a. a silencing RNA that suppresses generation of a gene product encoded by a reporting construct, wherein the gene product generates the detectable positive signal when expressed; b. a ribozyme that generates the negative detectable signal, and wherein the positive detectable signal is generated when the ribozyme is deactivated; c. a ribozyme that converts a substrate to a first color and wherein the substrate converts to a second color when the ribozyme is deactivated; d. an aptamer and/or a polynucleotide-tethered inhibitor; e. a polynucleotide to which a detectable ligand and a masking component are attached; f. a nanoparticle held in aggregate by bridge molecules, wherein at least a portion of the bridge molecules comprises a polynucleotide, and wherein the solution undergoes a color shift when the nanoparticle is dispersed in solution; g. a quantum dot or fluorophore linked to one or more quencher molecules by a linking molecule, wherein at least a portion of the linking molecule comprises a polynucleotide; h. a polynucleotide in complex with an intercalating agent, wherein the intercalating agent changes absorbance upon cleavage of the polynucleotide; or i. two fluorophores tethered by a polynucleotide that undergo a shift in fluorescence when released from the polynucleotide.

In some embodiments, the aptamer a. comprises a polynucleotide-tethered inhibitor that sequesters an enzyme, wherein the enzyme generates a detectable signal upon release from the aptamer or polynucleotide-tethered inhibitor by acting upon a substrate; or b. is an inhibitory aptamer that inhibits an enzyme and prevents the enzyme from catalyzing generation of a detectable signal from a substrate or wherein the polynucleotide-tethered inhibitor inhibits an enzyme and prevents the enzyme from catalyzing generation of a detectable signal from a substrate; or c. sequesters a pair of agents that when released from the aptamers combine to generate a detectable signal. In some embodiments, the nanoparticle is a colloidal metal. In some embodiments, the at least one guide polynucleotide comprises a mismatch. In some embodiments, the mismatch is up- or downstream of a single nucleotide variation on the one or more guide sequences.

In another aspect, the present disclosure provides a cell or organism comprising the engineered CRISPR-Cas protein herein, the polynucleic acid herein, the CRISPR-Cas system, or the vector or vector system herein.

In another aspect, the present disclosure provides an engineered adenosine deaminase comprising one or more mutations, wherein the engineered adenosine deaminase has cytidine deaminase activity.

In some embodiments, the engineered adenosine deaminase has adenosine deaminase activity. In some embodiments, the engineered adenosine deaminase is a portion of a fusion protein. In some embodiments, the fusion protein comprises a functional domain. In some embodiments, the functional domain is capable of directing the engineered adenosine deaminase to bind to a target nucleic acid. In some embodiments, the functional domain is a CRISPR-Cas protein herein. In some embodiments, the CRISPR-Cas protein is a dead form CRISPR-Cas protein or CRISPR-Cas nickase protein. In some embodiments, the one or more mutations comprise: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, S661T based on amino acid sequence positions of hADAR2-D, and corresponding mutations in a homologous ADAR protein. In some embodiments, the one or more mutations comprise: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, and S661T based on amino acid sequence positions of hADAR2-D, and corresponding mutations in a homologous ADAR protein.
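For illustration, each mutation label above follows the pattern of wildtype residue, residue position, and substituted residue (for example, E488Q denotes glutamate at position 488 replaced by glutamine). The following minimal sketch parses such labels and applies them to a sequence fragment. The helper names, the `first_residue_number` parameter, and the four-residue toy fragment are hypothetical; the fragment is not the hADAR2-D sequence.

```python
import re
from typing import NamedTuple

class Substitution(NamedTuple):
    wild_type: str   # one-letter code of the original residue
    position: int    # 1-based residue number in the reference numbering
    mutant: str      # one-letter code of the replacement residue

def parse_substitution(label):
    """Parse a label such as 'E488Q' into its three parts."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label.strip())
    if m is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(m.group(1), int(m.group(2)), m.group(3))

def apply_substitutions(seq, labels, first_residue_number=1):
    """Apply substitutions to seq, whose first residue carries the given number."""
    residues = list(seq)
    for label in labels:
        sub = parse_substitution(label)
        idx = sub.position - first_residue_number
        if not 0 <= idx < len(residues):
            raise ValueError(f"{label}: position outside the supplied sequence")
        if residues[idx] != sub.wild_type:
            raise ValueError(f"{label}: found {residues[idx]} instead of {sub.wild_type}")
        residues[idx] = sub.mutant
    return "".join(residues)

# Labels as listed in the text above.
LISTED = ("E488Q V351G S486A T375S S370C P462A N597I L332I I398V K350I "
          "M383L D619G S582T V440I S495N K418E S661T").split()
print(len([parse_substitution(label) for label in LISTED]))

# Hypothetical toy fragment standing in for residues 486-489.
print(apply_substitutions("SAEK", ["E488Q", "S486A"], first_residue_number=486))
```

Checking the wildtype residue before substituting guards against applying a label to a sequence that uses a different numbering, which is relevant when mapping to corresponding positions in a homologous ADAR protein.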

In another aspect, the present disclosure provides a polynucleotide encoding the engineered adenosine deaminase, or a catalytic domain thereof. In another aspect, the present disclosure provides a vector comprising the polynucleotide.

In another aspect, the present disclosure provides a pharmaceutical composition comprising the engineered adenosine deaminase or a catalytic domain thereof formulated for delivery by liposomes, nanoparticles, exosomes, microvesicles, nucleic acid nanoassemblies, a gene gun, or an implantable device.

In another aspect, the present disclosure provides an engineered cell expressing the engineered adenosine deaminase or a catalytic domain thereof. In some embodiments, the cell transiently expresses the engineered adenosine deaminase or the catalytic domain thereof. In some embodiments, the cell non-transiently expresses the engineered adenosine deaminase or the catalytic domain thereof.

In another aspect, the present disclosure provides an engineered, non-naturally occurring system for modifying nucleotides in a target nucleic acid, comprising a) a dead CRISPR-Cas or CRISPR-Cas nickase protein, or a nucleotide sequence encoding said dead Cas or Cas nickase protein; b) a guide molecule comprising a guide sequence that hybridizes to a target sequence and designed to form a complex with the dead CRISPR-Cas or CRISPR-Cas nickase protein; and c) a nucleotide deaminase protein or catalytic domain thereof, or a nucleotide sequence encoding said nucleotide deaminase protein or catalytic domain thereof, wherein said nucleotide deaminase protein or catalytic domain thereof is covalently or non-covalently linked to said dead CRISPR-Cas or CRISPR-Cas nickase protein or said guide molecule, or is adapted to link thereto after delivery.

In some embodiments, said adenosine deaminase protein or catalytic domain thereof comprises one or more of the mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, S661T based on amino acid sequence positions of hADAR2-D, and corresponding mutations in a homologous ADAR protein. In some embodiments, said adenosine deaminase protein or catalytic domain thereof comprises mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, and S661T based on amino acid sequence positions of hADAR2-D, and corresponding mutations in a homologous ADAR protein.

In some embodiments, the CRISPR-Cas protein is Cas9, Cas12, Cas13, Cas14, CasX, or CasY. In some embodiments, the CRISPR-Cas protein is Cas13b. In some embodiments, the CRISPR-Cas protein is Cas13b-t1, Cas13b-t2, or Cas13b-t3. In some embodiments, the CRISPR-Cas is an engineered CRISPR-Cas protein.

In another aspect, the present disclosure provides a method for modifying a nucleotide in a target nucleic acid, comprising: delivering to said target nucleic acid the engineered adenosine deaminase, or the system, wherein the deaminase deaminates a nucleotide at one or more target loci on the target nucleic acid.

In some embodiments, said nucleotide deaminase protein or catalytic domain thereof has been modified to increase activity against a DNA-RNA heteroduplex. In some embodiments, said nucleotide deaminase protein or catalytic domain thereof has been modified to reduce off-target effects. In some embodiments, the target nucleic acid is within a cell. In some embodiments, said cell is a eukaryotic cell. In some embodiments, said cell is a non-human animal cell. In some embodiments, said cell is a human cell. In some embodiments, said cell is a plant cell. In some embodiments, said target nucleic acid is within an animal. In some embodiments, said target nucleic acid is within a plant. In some embodiments, said target nucleic acid is comprised in a DNA molecule in vitro. In some embodiments, the engineered adenosine deaminase, or one or more components of the system are delivered to the cell as a ribonucleoprotein complex. In some embodiments, the engineered adenosine deaminase, or one or more components of the system are delivered via one or more particles, one or more vesicles, or one or more viral vectors. In some embodiments, said one or more particles comprise a lipid, a sugar, a metal or a protein. In some embodiments, said one or more particles comprise lipid nanoparticles. In some embodiments, said one or more vesicles comprise exosomes or liposomes. In some embodiments, said one or more viral vectors comprise one or more adenoviral vectors, one or more lentiviral vectors, or one or more adeno-associated viral vectors. In some embodiments, said method modifies a cell, a cell line or an organism by manipulation of one or more target sequences at genomic loci of interest. In some embodiments, said deamination of said nucleotide at said target locus of interest remedies a disease caused by a G→A or C→T point mutation or a pathogenic SNP. In some embodiments, said disease is selected from cancer, haemophilia, beta-thalassemia, Marfan syndrome and Wiskott-Aldrich syndrome.
In some embodiments, said deamination of said nucleotide at said target locus of interest remedies a disease caused by a T→C or A→G point mutation or a pathogenic SNP. In some embodiments, said deamination of said nucleotide at said target locus of interest inactivates a target gene at said target locus. In some embodiments, the engineered adenosine deaminase, or one or more components of the system are delivered by liposomes, nanoparticles, exosomes, microvesicles, nucleic acid nanoassemblies, a gene gun, an implantable device, or the vector system. In some embodiments, modification of the nucleotide modifies a gene product encoded at the target locus or expression of the gene product.

These and other aspects, objects, features, and advantages of the example embodiments will become apparent to those having ordinary skill in the art upon consideration of the following detailed description of illustrated example embodiments.

BRIEF DESCRIPTION OF THE DRAWINGS

An understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention may be utilized, and the accompanying drawings of which:

FIGS. 1A-1D. The crystal structure of PbuCas13b-crRNA Binary Complex. (FIG. 1A) Linear domain organization of PbuCas13b. Active site positioning is denoted by asterisks. (FIG. 1B) crRNA hairpin in complex with PbuCas13b. (FIG. 1C) Overall structure of PbuCas13b. Two views are rotated 180 degrees from each other. Domains are colored consistent with the linear domain map. crRNA is colored red. (FIG. 1D) Space-filling model of PbuCas13b, each view rotated 180 degrees from each other.

FIGS. 2A-2E. PbuCas13b crRNA recognition. (FIG. 2A) Diagram of PbuCas13b crRNA (SEQ ID NO:1). Direct repeat residues are colored red, and spacer residues in light blue. (FIG. 2B) Positioning of the 3′ end of the crRNA near K393 and coordinating residues within PbuCas13b. (FIG. 2C) Structure of the crRNA within the PbuCas13b complex. Coloring is consistent with panel (FIG. 2A). (FIG. 2D) Base identity swapping. Upper panel, nuclease activity; lower panel, thermal stability. Hashed fill denotes wild type base identities. (FIG. 2E) Mutagenesis of Lid domain residues that coordinate and process crRNA within PbuCas13b. Upper panel, RNase activity in SHERLOCK reaction; lower panel, crRNA processing. Cleavage bands and expected sizes are indicated by red markers; a ladder with sizes is shown on the left.

FIG. 3. Schematic view of the intermolecular contacts between PbuCas13b and crRNA (SEQ ID NO:2).

FIGS. 4A-4C. PbuCas13b comparison to LshCas13a architecture and active site. (FIG. 4A) Linear comparison of domain organization of PbuCas13b and LshCas13a (pdb 5wtk). crRNAs are shown to the right. (FIG. 4B) Two views of PbuCas13b rotated 90 degrees. Inset is zoomed in on active site residues in the same orientation as in (FIG. 4C). (FIG. 4C) LshCas13a colored consistently with (FIG. 4A). Homologous residues are labeled.

FIGS. 5A-5H. Site-directed mutagenesis of PbuCas13b; RNA interference in mammalian cells. (FIG. 5A) Effect of all PbuCas13b site-directed mutations on RNA interference in mammalian cells. Strongest interference knockdowns are colored in light blue. (FIG. 5B) PbuCas13b with strong mutations labeled and colored in red. (FIGS. 5C-5H) Mutations separated by region.

FIGS. 6A-6D. (FIG. 6A) Surface electrostatics of PbuCas13b. (FIG. 6B) Surface electrostatics of PbuCas13b rotated 180 degrees from panel A. (FIG. 6C) Surface electrostatics of PbuCas13b with the Lid domain removed, showing the inner positively charged channel. (FIG. 6D) Surface electrostatics of the putative crRNA processing active site.

FIG. 7. REPAIR assay of pgCas13b C-terminal truncations.

FIGS. 8A-8G. (FIG. 8A) PbuCas13b direct repeat structure. (FIG. 8B) Ideal A-form RNA. (FIG. 8C) Diagram of direct repeat base pairing and secondary structure (SEQ ID NO:3). (FIG. 8D) Multiplete one. (FIG. 8E) Multiplete two. (FIG. 8F) Multiplete three. (FIG. 8G) Alignment of PbuCas13b direct repeat sequences (SEQ ID NOs:4-9). Asterisks denote conserved nucleotides.

FIG. 9. Expanded data for cleavage activity of PbuCas13b with mutated crRNA, and thermal stability of crRNA mutants.

FIGS. 10A-10D. (FIG. 10A) Schematic of crRNA substrate for processing assay (SEQ ID NOs:10-11). (FIG. 10B) Gel showing complementary DR is not processed. (FIG. 10C) crRNA processing by mutants of PbuCas13b. (FIG. 10D) SHERLOCK assay measuring general RNase activity.

FIGS. 11A-11C. Melting curves of PbuCas13b with substrate RNA and magnesium ions. (FIG. 11A) The effect of RNA substrate on PbuCas13b thermal stability. (FIG. 11B) The effect of PbuCas13b RNA cleavage and thermal stability. (FIG. 11C) The effect of magnesium on PbuCas13b thermal stability.

FIG. 12. Limited proteolysis of PbuCas13b with RNA substrate. T=Trypsin, C=Chymotrypsin, P=Pepsin.

FIGS. 13A-13C. Cas13b bridge-helix. (FIG. 13A) Cas13b with bridge-helix highlighted in red. RNA is colored in pink. (FIG. 13B) Cas12 (Cpf1) with bridge-helix highlighted in cyan. RNA is colored in light blue, DNA dark blue. (FIG. 13C) Manual sequence alignment of bridge helix from PbuCas13b and LbCas12 (SEQ ID NOs:12-13).

FIG. 14. Cas13b Neighbor-joining tree of all Cas13b family members. Inset, Cas13b subset with PbuCas13b (bolded).

FIG. 15. Structure based alignment of Cas13b subgroup (SEQ ID NOs:14-22).

FIG. 16. Structure based alignment of all Cas13bs (SEQ ID NOs:23-37).

FIGS. 17A-17D. Raw uncropped images of all gels shown in figures. (FIG. 17A) crRNA processing gel1. (FIG. 17B) crRNA processing gel2. (FIG. 17C) crRNA processing gel3. (FIG. 17D) limited proteolysis gel.

FIG. 18. Grouped topology map of PbuCas13b crystal structure.

FIG. 19 shows a PyMOL file that shows a position of the coordinated nucleotide in the active site of Cas13b.

FIG. 20 shows an exemplary RNA loop extension.

FIG. 21 shows exemplary fusion points via which a nucleotide deaminase is linked to a Cas13b.

FIG. 22 shows screening for mutations for RESCUEv9.

FIG. 23 shows validation of RESCUEv9's effect on T-flip guides.

FIG. 24 shows validation of RESCUEv9's effect on C-flip guides.

FIG. 25 shows performance of RESCUEv9 on endogenous targeting.

FIG. 26 shows screening for mutations for RESCUEv10.

FIG. 27 shows test results of 30-bp guides for C-flips.

FIG. 28 shows Gluc/Cluc results from comparison between Cas13b6 and Cas13b12 with RESCUE v1 through v8.

FIG. 29 shows fraction editing results from comparison between Cas13b6 and Cas13b12 with RESCUE v1 through v8.

FIG. 30 shows effects on endogenous targeting (T-flips) results from comparison between Cas13b6 and Cas13b12 with RESCUEv8.

FIG. 31 shows effects of RESCUEs on base converting.

FIG. 32 shows test results of CCN 3′ motif targeting.

FIG. 33A shows a schematic of constructs with dCas13b fused with ADAR. FIG. 33B shows test results of the constructs.

FIG. 34 shows sequencing of the N-terminal tag and linkers.

FIG. 35 shows quantification of off-targets.

FIG. 36 shows testing of off-target edits.

FIG. 37 shows test results of endogenous genes targets with (GGS)2/Q507R.

FIG. 38 and FIG. 39 show eGFP screening of mutations on (GGS)2/Q507R.

FIG. 40A shows constructs with Cas13b truncation. FIG. 40B shows test results of the constructs.

FIG. 41 shows multiplexed on/off-target guides for screening (SEQ ID NOs:38-39).

FIGS. 42A-42E show validation tests on RESCUEv10. FIG. 42A shows validation of RESCUEv10 (Rounds 50, 52). FIG. 42B shows validation of RESCUEv10 (Rounds 53, 54). FIG. 42C shows validation of RESCUEv10 (Round 58). FIG. 42D shows validation of RESCUEv10 (Round 59). FIG. 42E shows validation of RESCUEv10 (Round 61).

FIG. 43 shows NGS analysis of RESCUEv10.

FIG. 44 shows identified mutations that improve specificity.

FIG. 45 shows effects of RESCUE on endogenous targeting (C-flips and T-flips) results.

FIG. 46 shows targeting β-catenin using RESCUE v6 and v9.

FIG. 47 shows new β-catenin secreted Gluc/Cluc reporter.

FIG. 48 shows results of targeting β-catenin by RESCUEv10.

FIG. 49 shows targeting ApoE4 by RESCUEv10.

FIG. 50 shows exemplary mutations in PCSK9 that can be generated using RESCUE.

FIG. 51 shows results from Gluc knockdown in mammalian cells by Cas13b-t1.

FIG. 52 shows results from Gluc knockdown in mammalian cells by Cas13b-t2.

FIG. 53 shows results from Gluc knockdown in mammalian cells by Cas13b-t3.

FIGS. 54A-54C show loci of Cas13b-t1, Cas13b-t2, and Cas13b-t3.

FIGS. 55A-55C show more details on loci of Cas13b-t1, Cas13b-t2, and Cas13b-t3 (SEQ ID NOs:40-45).

FIG. 56 shows alignments of Cas13b-t1, Cas13b-t2, and Cas13b-t3 with other Cas13b orthologs (SEQ ID NOs:46-64).

FIG. 57 shows a summary of RESCUE mutations screened.

FIG. 58 is a graph illustrating results of an experiment in which better beta catenin mutants were selected.

FIG. 59 shows graphs illustrating results of RESCUE round 12.

FIG. 60 is a schematic illustrating the beta catenin migration assay.

FIG. 61 is a graph showing results of a cell migration assay induced by beta catenin.

FIG. 62 shows graphs illustrating that specificity mutations eliminate A-I off-targets.

FIG. 63 shows graphs illustrating that targeting Stat1/3 phosphorylation sites reduces signaling.

FIG. 64 shows graphs illustrating that targeting Stat1/3 phosphorylation sites reduces signaling (STAT1 non-treatment (left) and STAT1 IFNγ treatment (right)).

FIG. 65 shows graphs illustrating that targeting Stat1/3 phosphorylation sites reduces signaling, with FIG. 65A showing results for STAT3 IL6 activation and FIG. 65B showing results for STAT3 no treatment.

FIG. 66 shows graphs illustrating results of RESCUE round 12.

FIG. 67 shows graphs illustrating results from a potential RESCUE round 13.

FIG. 68 is a graph showing results of a cell migration assay induced by beta catenin.

FIG. 69 shows a graph illustrating results of comparison of dead and live tiny orthologs for Gluc knock down.

FIG. 70 shows a graph illustrating testing of the function of Cas13b-t1.

FIG. 71 shows a graph illustrating testing of the function of Cas13b-t3.

FIG. 72 shows a graph illustrating the guides, non-targeting comparison.

FIGS. 73A-73G: Directed evolution of an ADAR2 deaminase domain for cytidine deamination. (FIG. 73A) Schematic of the directed evolution approach, involving rational mutagenesis, yeast screening, and mammalian cell validation of activity. (FIG. 73B) Activity of RESCUE versions 0-16 on a cytidine flanked by a 5′ U and a 3′ G on a Gluc transcript. Left: Luciferase reporter activity is reported for RESCUEv0-v16. Right: Percent editing levels of RESCUEv0-v16 are reported. (FIG. 73C) Heatmap depicting the percent editing levels of RESCUEv0-v16 on cytidines flanked by varying bases on the Gluc transcript. (FIG. 73D) Percent editing of RESCUEv0-v16 on a cytidine flanked by a 5′ U and a 3′ G on a Gluc transcript at varying levels of the RESCUE plasmid transfected. (FIG. 73E) Editing activity of RESCUEv16 and RESCUEv8 on all 16 possible cytidine flanking base motifs on the Gluc transcript. Guide designs with either a T-flip or a C-flip across from the target cytidine are used. (FIG. 73F) Cytidine deamination by RESCUEv16 is compared to editing with the guide RNA along with either ADAR2dd, full length ADAR2, or no protein. (FIG. 73G) A zoomed-in crystal structure view of the mutants at the catalytic deamination site, with the RNA and its flipped-out base also shown.

FIGS. 74A-74G: C to U editing by RESCUE on endogenous and disease relevant targets. (FIG. 74A) Editing efficiency of RESCUEv16 on a panel of endogenous genes covering multiple motifs. (FIG. 74B) Heatmap depicting editing efficiency of RESCUE versions v0-v16 on a panel of three endogenous genes. (FIG. 74C) Editing efficiency of RESCUEv16 on a set of synthetic versions of relevant T>C disease mutations. (FIG. 74D) Schematic of multiplexed C to U and A to I editing with pre-crRNA guide arrays. (FIG. 74E) Simultaneous C to U and A to I editing on beta catenin transcripts. (FIG. 74F) Schematic of rational prevention of off-target activity at neighboring adenosine sites via introduction of disfavored base flips (SEQ ID NOs:65-66). (FIG. 74G) Percent editing at on-target C and off-target A sites for Gaussia luciferase (left) and KRAS (right) using rational introduction of disfavored base flips.

FIGS. 75A-75F: Transcriptome-wide specificity of RESCUEv16. (FIG. 75A) On-target C to U editing and summary of C to U and A to I transcriptome-wide off targets of RESCUEv16 and B6-REPAIRv1, B12-REPAIRv1, and B12-REPAIRv2. (FIG. 75B) Manhattan plot of RESCUEv16 A to I and C to U off targets. The on-target C to U edit is highlighted in orange. (FIG. 75C) Schematic of the interactions between ADAR2dd residues and double stranded RNA substrate with residues used in a mutagenesis screen for improving specificity highlighted red (SEQ ID NOs:67-68). (FIG. 75D) Luciferase values for C to U activity with a targeting guide (y-axis) and A to I activity with a non-targeting guide (x-axis) shown for RESCUEv16 and 95 RESCUEv16 mutants. Mutants highlighted in blue have efficient targeted C to U activity, but have lost their residual A to I activity, indicating an improvement in A to I specificity. (FIG. 75E) On-target C to U editing and summary of C to U and A to I transcriptome-wide off targets of RESCUEv16 and top specificity mutants. (FIG. 75F) Manhattan plot of RESCUEv16S (+S375A) A to I and C to U off targets (SEQ ID NOs:65-66). The on-target C to U edit is highlighted in orange.

FIGS. 76A-76H: Phenotypic outcomes directed by C to U RNA editing for cell growth and signaling. (FIG. 76A) Schematic of RNA targeting against phosphorylated residues of STAT3 to alter associated signaling pathways (SEQ ID NOs:69-74). (FIG. 76B) Percent editing at relevant phosphorylated residues in STAT3 (left) and STAT1 (right) by RESCUEv16. (FIG. 76C) Inhibition of STAT3 (left) and STAT1 (right) signaling by RNA editing as measured by STAT-driven luciferase expression. (FIG. 76D) Schematic of RNA targeting against phosphorylated residues of CTNNB1 to promote stabilization (SEQ ID NOs:75-77). (FIG. 76E) Schematic of beta catenin activation via editing of phosphorylated residues by RESCUE, resulting in increased cellular growth. (FIG. 76F) Percent editing at relevant phosphorylated residues in CTNNB1 by RESCUEv16. (FIG. 76G) Activation of CTNNB1 signaling by RNA editing as measured by CTNNB1-driven (TCF/LEF) luciferase expression. (FIG. 76H) Quantitation of cellular growth due to activation of CTNNB1 signaling by RNA editing.

FIGS. 77A-77B: Screening of inactivating Gluc mutations for generating a cytosine deamination luciferase reporter. (FIG. 77A) Luciferase activity of a panel of various Gluc mutants previously shown to have some effect on luciferase activity [cite Gluc paper]. Values represent mean+/−S.E.M (n=3). (FIG. 77B) Luciferase activity of a panel of leucine to proline Gluc mutants. Leucine to proline mutant reporters were focused on because they generate a CCN motif site for cytidine deamination (center C is deaminated). This allows for assaying the effect of all four CCN motifs on RESCUE deamination activity. Values represent mean+/−S.E.M (n=3).
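The rationale for the leucine to proline reporters can be checked against the standard genetic code: every CCN codon encodes proline, and deaminating the center C to U yields a CUN codon, each of which encodes leucine. The following minimal sketch illustrates this; the table holds only the two relevant codon families, and the function name is hypothetical.

```python
BASES = "UCAG"

# Subset of the standard genetic code: CCN encodes proline, CUN encodes leucine.
CODON_TABLE = {"CC" + n: "P" for n in BASES}
CODON_TABLE.update({"CU" + n: "L" for n in BASES})

def deaminate_center_c(codon):
    """Model C-to-U deamination at the middle position of a codon."""
    if len(codon) != 3 or codon[1] != "C":
        raise ValueError(f"expected a codon with a center C, got {codon!r}")
    return codon[0] + "U" + codon[2]

for n in BASES:
    before = "CC" + n
    after = deaminate_center_c(before)
    print(before, CODON_TABLE[before], "->", after, CODON_TABLE[after])
```

Because all four CCN codons revert to leucine codons upon editing, a single leucine to proline reporter design can cover each of the four 3′ flanking bases.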

FIG. 78: Cytidine deamination activity of RESCUEv0-v16 on CCG, ACG, GCG, CCA, and CCU sites in Gluc. Values represent mean+/−S.E.M (n=3).
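For reference, a mean and standard error of the mean for n=3 replicates can be computed as sketched below. This sketch assumes the sample standard deviation (n−1 denominator), which the figure descriptions do not specify; the function name and the replicate values are hypothetical and are not data from the disclosure.

```python
import math
import statistics

def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for an S.E.M.")
    # Assumption: S.E.M. = sample standard deviation / sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem

# Hypothetical percent-editing values for three replicates.
mean, sem = mean_sem([12.0, 15.0, 18.0])
print(f"{mean:.2f} +/- {sem:.2f}")
```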

FIGS. 79A-79B: Cytidine deamination activity of varying amounts of RESCUEv0-v16. (FIG. 79A) Dose response of RESCUEv0-v16 activity as measured by restoration of luciferase activity on a UCG site in the Gluc transcript. Values represent mean of three replicates. (FIG. 79B) Dose response of RESCUEv0-v16 activity as measured by restoration of luciferase activity on the T41I site in the CTNNB1 transcript. Values represent mean of three replicates.

FIG. 80: Percent editing of a UCG site in the Gluc transcript by RESCUEv6-v9 at varying guide and RESCUE plasmid amounts. Values represent mean+/−S.E.M (n=3).

FIG. 81: Percent editing of Gluc sites with all 16 possible 5′ and 3′ base combinations with RESCUEv16 and v8 using guides with either G or A mismatches. Values represent mean+/−S.E.M (n=3).
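The 16 combinations referred to above arise from pairing each of the four possible 5′ bases with each of the four possible 3′ bases around the target cytidine. A minimal sketch that enumerates them follows; the function name is hypothetical, and the ordering of the output is arbitrary.

```python
from itertools import product

RNA_BASES = "ACGU"

def flanking_motifs(target="C"):
    """List every 5'-N, target, N-3' trinucleotide motif (4 x 4 = 16)."""
    return [five + target + three for five, three in product(RNA_BASES, repeat=2)]

motifs = flanking_motifs()
print(len(motifs))
print(" ".join(motifs))
```

The UCG site used in several of the Gluc reporter experiments is one of these 16 motifs.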

FIG. 82: Percent editing of RESCUEv1 and RESCUEv2-v8 on a UCG site in the Gluc transcript with guide RNAs of varying U mismatch positions. RESCUE versions are compared with both RanCas13b and PspCas13b. Values represent mean+/−S.E.M (n=3). 20/22 denotes 20 mismatch distance for RanCas13b and 22 mismatch distance for PspCas13b.

FIG. 83: Percent editing of RESCUEv16 on a UCG site in the Gluc transcript with 30 bp and 50 bp guides with varying U mismatch positions. Values represent mean+/−S.E.M (n=3).

FIGS. 84A-84D: Editing rates of various yeast reporters for directed evolution. (FIG. 84A) Percent fluorescence correction of the GFP mutation Y66H by RESCUEv3, v7, and v16 with targeting and non-targeting guides. Fluorescence is measured by performing flow cytometry on 10,000 cells. (FIG. 84B) Percent editing correction of the GFP mutation Y66H by RESCUEv3, v7, and v16 with targeting and non-targeting guides. Values represent mean+/−S.E.M (n=3). (FIG. 84C) Percent editing correction of the HIS3 mutation P196L by RESCUEv7 and v16 with targeting and non-targeting guides. Values represent mean+/−S.E.M (n=3). (FIG. 84D) Percent editing correction of the HIS3 mutation S129P by RESCUEv7 and v16 with targeting and non-targeting guides. Values represent mean+/−S.E.M (n=3).

FIGS. 85A-85B: Biochemical deamination activity of ADAR2 deaminase domain containing RESCUEv2 mutations using recombinant protein. (FIG. 85A) Adenosine deamination activity of ADAR2 deaminase domain protein containing RESCUEv2 mutations with a 22 bp double-stranded RNA substrate containing a center adenine mismatched with a cytosine. Reactions were incubated for varying time points and with and without the deaminase domain. (FIG. 85B) Cytidine deamination activity of ADAR2 deaminase domain protein containing RESCUEv2 mutations with a 22 bp double-stranded RNA substrate containing a center cytosine mismatched with a uridine. Reactions were incubated for varying time points and with and without the deaminase domain.

FIGS. 86A-86E: Comparison of cytidine deaminase activity of RESCUEv16, full ADAR2 (with RESCUEv16 mutations), ADAR2 deaminase domain (with RESCUEv16 mutations), and without any protein. (FIG. 86A) Percent editing of a site in the Gluc transcript with varying 5′ bases with a targeting guide and RESCUEv16, full ADAR2 (with RESCUEv16 mutations), ADAR2 deaminase domain (with RESCUEv16 mutations), and no protein. Values represent mean+/−S.E.M (n=3). (FIG. 86B) Percent editing of a site in the Gluc transcript with varying 5′ bases with a non-targeting guide and RESCUEv16, full ADAR2 (with RESCUEv16 mutations), ADAR2 deaminase domain (with RESCUEv16 mutations), and no protein. Values represent mean+/−S.E.M (n=3). (FIG. 86C) Editing of a UCG site in the Gluc transcript with RESCUEv16 and guide RNAs containing varying mismatch positions. Values represent mean+/−S.E.M (n=3). (FIG. 86D) Editing of a UCG site in the Gluc transcript with full-length ADAR2 (with RESCUEv16 mutations) and guide RNAs containing varying mismatch positions. Values represent mean+/−S.E.M (n=3). (FIG. 86E) Editing of a UCG site in the Gluc transcript with ADAR2 deaminase domain (with RESCUEv16 mutations) and guide RNAs containing varying mismatch positions. Values represent mean+/−S.E.M (n=3).

FIGS. 87A-87C: Mismatch position tiling to find optimal editing guide design for RESCUEv16 on endogenous target sites. (FIG. 87A) Percent editing of endogenous target sites with varying base motifs with RESCUEv16 and guides with mismatches at position 7, 9, 11, and 13 and U base flips. Values represent mean+/−S.E.M (n=3). (FIG. 87B) Percent editing of endogenous target sites with varying base motifs with RESCUEv16 and guides with mismatches at position 7, 9, 11, and 13 and C base flips. Values represent mean+/−S.E.M (n=3). (FIG. 87C) Percent editing of endogenous target sites with varying base motifs with RESCUEv16 and guides with mismatches at position 3, 5, 7, 9, and 11 and C and U base flips. Values represent mean+/−S.E.M (n=3).

FIG. 88: Cytidine deamination activity of varying amounts of RESCUEv0-v16 as measured by percent editing at a KRAS site. Values represent mean of three replicates.

FIG. 89: Percent editing of various disease-relevant mutations on synthetic reporters using RESCUEv16 and guides with varying mismatch positions. Values represent mean+/−S.E.M (n=3).

FIG. 90: Percent editing at the two ApoE4 cytosines (rs429358 and rs7412) using RESCUEv16 with guides of varying C and U mismatch positions. Values represent mean+/−S.E.M (n=3).

FIGS. 91A-91C: Specificity of RESCUE versions in the guide duplex window. (FIG. 91A) Schematic of editing site of Gaussia luciferase mutant C82R, with the targeted C highlighted in red and nearby adenine bases numbered and highlighted in gray. (FIG. 91B) Percent editing at nearby adenine bases in Gaussia luciferase mutant C82R with targeting by RESCUEv0, RESCUEv8, and RESCUEv16. (FIG. 91C) Percent editing of adenine to guanosine at adenine 20 by varying amounts of RESCUEv0-v16. Values represent mean of three replicates.

FIGS. 92A-92D: Adenosine deaminase activity of RESCUEv0-v16 and RESCUEv16S. (FIG. 92A) Luciferase correction via adenosine deamination of the Gluc transcript by RESCUEv0-v16 and RESCUEv16S using a targeting guide RNA. Values represent mean+/−S.E.M (n=3). (FIG. 92B) Luciferase correction via adenosine deamination of the Gluc transcript by RESCUEv0-v16 and RESCUEv16S using a non-targeting guide RNA. Values represent mean+/−S.E.M (n=3). (FIG. 92C) Percent editing of adenosine to inosine of the Gluc transcript by RESCUEv0-v16 and RESCUEv16S using a targeting guide RNA. Values represent mean+/−S.E.M (n=3). (FIG. 92D) Percent editing of adenosine to inosine of the Gluc transcript by RESCUEv0-v16 and RESCUEv16S using a non-targeting guide RNA. Values represent mean+/−S.E.M (n=3).

FIGS. 93A-93C: Cytidine deamination activity and off-target activity on a Beta-catenin target site using varying amounts of RESCUEv0-v16 and RESCUEv16S. (FIG. 93A) Schematic of editing site of CTNNB1 T41I, with the targeted C highlighted in red and the nearby off-target adenine base highlighted in gray. (FIG. 93B) Percent editing of cytosine to uridine (T41A) by varying amounts of RESCUEv0-v16 and RESCUEv16S. Values represent mean of three replicates. (FIG. 93C) Percent editing of adenine to guanosine at the off-target adenine by varying amounts of RESCUEv0-v16 and RESCUEv16S. Values represent mean of three replicates.

FIGS. 94A-94E: On-target and off-target editing of RESCUEv16 and RESCUEv16S on endogenous targets. (FIG. 94A) Percent editing of endogenous target sites with varying base motifs with RESCUEv16 and RESCUEv16S. Values represent mean+/−S.E.M (n=3). (FIG. 94B) Percent editing at neighboring adenine bases in NRAS I21I with targeting by RESCUEv16 and RESCUEv16S. (FIG. 94C) Percent editing at neighboring adenine bases in NF2 T21M with targeting by RESCUEv16 and RESCUEv16S. (FIG. 94D) Percent editing at neighboring adenine bases in RAF1 P30S with targeting by RESCUEv16 and RESCUEv16S. (FIG. 94E) Percent editing at neighboring adenine bases in CTNNB1 P44S with targeting by RESCUEv16 and RESCUEv16S.

FIGS. 95A-95B: Summary of amino acid changes enabled by RESCUE. (FIG. 95A) Amino acid conversions possible using cytidine deamination by RESCUE. (FIG. 95B) Codon table showing all potential amino acid changes possible by RESCUE.

FIG. 96: RESCUE v16S was able to effectively edit endogenous genes.

FIG. 97: RESCUE v16S maintained some A to I activity.

FIG. 98: RESCUE v16 was used to target STAT to reduce IFNγ/IL6 induction.

FIGS. 99A-99B: RESCUE targeting induces cell growth.

FIG. 100. A schematic showing an example transcript tracking method.

FIG. 101 shows an example system and method of programmable cytidine to uridine conversion according to some embodiments herein.

FIG. 102 shows example approaches of correcting mutations and/or targeting post-translational signaling or catalysis using base editors according to some embodiments herein.

FIGS. 103A-103E Evolution of an ADAR2 deaminase domain for cytidine deamination in reporter and endogenous transcripts. FIG. 103A. Schematic of RNA targeting of the catalytic residue mutant (C82R) of Gaussia luciferase reporter transcript (SEQ ID NO:712-714). FIG. 103B. Heatmap depicting the percent editing levels of RESCUEr0-r16 on cytidines flanked by varying bases on the Gluc transcript. More favorable editing motifs are shown at the top, while less favorable motifs (5′C) are shown at the bottom. FIG. 103C. Editing activity of RESCUE on all 16 possible cytidine flanking base motifs on the Gluc transcript with U-flip or C-flip guides. FIG. 103D. Activity comparison between RESCUE, ADAR2dd without Cas13, full-length ADAR2 without Cas13, or no protein. FIG. 103E. Editing efficiency of RESCUE on a panel of endogenous genes covering multiple motifs. The best guide for each site is shown with the entire panel of guides displayed in FIG. 125.

FIGS. 104A-104F Phenotypic outcomes of RESCUE on cell growth and signaling. FIG. 104A. Schematic of b-catenin domains and RESCUE targeting guide (SEQ ID NO:715-717). FIG. 104B. Schematic of b-catenin activation and cell growth via RESCUE editing. FIG. 104C. Percent editing by RESCUE at relevant positions in the CTNNB1 transcript. FIG. 104D. Activation of Wnt/b-catenin signaling by RNA editing as measured by b-catenin-driven (TCF/LEF) luciferase expression. FIG. 104E. Representative microscopy images of RESCUE CTNNB1 targeting and non-targeting guides in HEK293FT cells. FIG. 104F. Quantitation of cellular growth due to activation of CTNNB1 signaling by RNA editing in HEK293FT cells.

FIGS. 105A-105D RESCUE and REPAIR multiplexing and specificity enhancement via guide engineering. FIG. 105A. Schematic of multiplexed C to U and A to I editing with pre-crRNA guide arrays. FIG. 105B. Simultaneous C to U and A to I editing on CTNNB1 transcripts. FIG. 105C. Schematic of rational engineering with guanine base flips to prevent off-target activity at neighboring adenosine sites (SEQ ID NO:718-719). FIG. 105D. Percent editing at on-target C and off-target A sites for Gaussia luciferase (left) and KRAS (right) using rational introduction of disfavored base flips.

FIGS. 106A-106G Transcriptome-wide specificity of RESCUE. FIG. 106A. On-target C to U editing and summary of C to U and A to I transcriptome-wide off-targets for RESCUE compared to REPAIR. FIG. 106B. Manhattan plots of RESCUE A to I (left) and C to U (right) off-targets. The on-target C to U edit is highlighted in orange. FIG. 106C. Schematic of the interactions between ADAR2dd residues and double-stranded RNA substrate with residues used in a mutagenesis screen for improving specificity highlighted in red (SEQ ID NO:720-721). FIG. 106D. Luciferase values for C to U activity with a targeting guide (y-axis) and A to I activity with a non-targeting guide (x-axis) shown for RESCUE and 95 RESCUE mutants. Mutants highlighted in blue have higher specificity with maintained C to U activity. RESCUE is highlighted in red. The T375G mutation that generates REPAIRv2 is shown in orange. FIG. 106E. On-target C to U editing and summary of C to U and A to I transcriptome-wide off-targets of RESCUE, REPAIR, and top specificity mutants. FIG. 106F. Manhattan plot of RESCUE-S (+S375A) A to I (left) and C to U (right) off-targets. The on-target C to U edit is highlighted in orange. FIG. 106G. Representative RNA sequencing reads surrounding the on-target Gluc editing site (blue triangle) for RESCUE (top) and RESCUE-S (bottom). A to I edits are highlighted in red; C to U (T) edits are highlighted in blue; sequencing errors are highlighted in yellow (SEQ ID NO:722-767).

FIGS. 107A-107B Targeted RNA cytidine to uridine editing enables new base conversions. FIG. 107A. Amino acid conversions possible using cytidine deamination by RESCUE, with corresponding post-translational modifications and biological activities. FIG. 107B. Schematic of the directed evolution approach, involving rational mutagenesis, yeast screening, and mammalian cell validation of activity. Rational mutagenesis began with targeting residues known to contact the RNA substrate, as shown in the schematic at the top, derived from the crystal structure of ADAR2dd (23). Residues targeted with saturation mutagenesis are highlighted in red. For directed evolution, a HIS3 growth reporter was used to enable positive selection of ADAR2dd mutants in yeast with C to U editing and restoration of the HIS3 gene. Top mutants from each round of yeast evolution are evaluated in mammalian cells for C to U editing activity and then the top mutant is used for the next round of yeast evolution.

FIG. 108. Comparison of RanCas13b-REPAIR and PspCas13b-REPAIR adenosine deamination activity in yeast with targeting and non-targeting guides. A to I correction of the Y66H mutation in EGFP restores GFP fluorescence and is measured by flow cytometry. As REPAIR with the catalytically inactive Cas13b ortholog from Riemerella anatipestifer (dRanCas13b) was more effective than REPAIR with the catalytically inactive Cas13b ortholog from Prevotella sp. P5-125 (dPspCas13b), we began with a dRanCas13b-ADAR2dd fusion for development of RESCUE.

FIGS. 109A-109B Screening of inactivating Gluc mutations for generating a cytosine deamination luciferase reporter. FIG. 109A. Luciferase activity of a panel of various Gluc mutants previously shown to have some effect on luciferase activity (33). Values represent mean+/−S.E.M (n=3). FIG. 109B. Luciferase activity of a panel of leucine to proline Gluc mutants. Leucine to proline mutant reporters were the focus because they generate a CCN motif site for cytidine deamination (the center C is deaminated). This allows for assaying the effect of all four CCN motifs on RESCUE deamination activity. Values represent mean+/−S.E.M (n=3); WT, wildtype Gluc sequence.

FIG. 110. Cytidine deamination activity of RESCUEr0-r16 on UCG, CCG, ACG, GCG, CCA, and CCU sites in Gluc. Values represent mean+/−S.E.M (n=3).

FIGS. 111A-111C Cytidine deamination activity of varying amounts of RESCUEr0-r16. FIG. 111A. Dose response of RESCUEr0-r16 activity as measured by restoration of luciferase activity on a UCG site in the Gluc transcript. Values represent mean of three replicates. FIG. 111B. Dose response of RESCUEr0-r16 activity as measured by C to U editing at a UCG site in the Gluc transcript. Values represent mean of three replicates. FIG. 111C. Dose response of RESCUEr0-r16 activity as measured by restoration of luciferase activity on the T41I site in the CTNNB1 transcript. Values represent mean of three replicates.

FIG. 112 Percent editing of a UCG site in the Gluc transcript by RESCUEr6-r9 at varying guide and RESCUE plasmid amounts. Values represent mean+/−S.E.M (n=3).

FIGS. 113A-113E Editing rates of various yeast reporters for directed evolution. FIG. 113A. Percent fluorescence correction of the GFP mutation Y66H by RESCUEr3, r7, and r16 with targeting and non-targeting guides. Fluorescence is measured by performing flow cytometry on 10,000 cells. T, targeting guide; NT, non-targeting guide. FIG. 113B. Percent editing correction of the GFP mutation Y66H by RESCUEr3, r7, and r16 with targeting and non-targeting guides. T, targeting guide; NT, non-targeting guide. FIG. 113C. Percent editing correction of the HIS3 mutation P196L by RESCUEr7 and r16 with targeting and non-targeting guides. T, targeting guide; NT, non-targeting guide. FIG. 113D. Percent editing correction of the HIS3 mutation S129P by RESCUEr7 and r16 with targeting and non-targeting guides. T, targeting guide; NT, non-targeting guide. FIG. 113E. Percent editing correction of the HIS3 mutation S22P by RESCUEr3, r7, and r16 with targeting guides of varying mismatch distance and a non-targeting guide at different hours after RESCUE induction. NT, non-targeting guide.

FIGS. 114A-114C Percent editing of Gluc sites with all 16 possible 5′ and 3′ base combinations with RESCUEr16 and r8 using guides with U, C, G, or A mismatches. FIG. 114A. Percent editing of Gluc sites with all 16 possible 5′ and 3′ base combinations with RESCUEr8 using guides with either U or C mismatches. Values represent mean+/−S.E.M (n=3). FIG. 114B. Percent editing of Gluc sites with all 16 possible 5′ and 3′ base combinations with RESCUEr8 using guides with either G or A mismatches. Values represent mean+/−S.E.M (n=3). FIG. 114C. Percent editing of Gluc sites with all 16 possible 5′ and 3′ base combinations with RESCUEr16 using guides with either G or A mismatches. Values represent mean+/−S.E.M (n=3).

FIG. 115 Percent editing of RESCUE on a UCG site in the Gluc transcript with 30 bp and 50 bp guides with varying U mismatch positions. Values represent mean+/−S.E.M (n=3).

FIG. 116 Percent editing of RESCUEr1 and RESCUEr3-r8 on a UCG site in the Gluc transcript with guide RNAs of varying U mismatch positions. Candidate rounds are compared with both RanCas13b and PspCas13b. Values represent mean+/−S.E.M (n=3). 20/22 denotes 20 mismatch distance for RanCas13b and 22 mismatch distance for PspCas13b. As REPAIR uses a fusion of ADAR2dd with dPspCas13b (7), we compared our RESCUE candidate rounds with fusions of PspCas13b and RanCas13b and found them to be equivalently active.

FIGS. 117A-117B View of RESCUE mutations on the crystal structure of the ADAR2 deaminase domain. FIG. 117A. The RESCUE mutants are shown in the ADAR2 crystal structure (blue) along with the flipped-out cytidine modeled in purple. FIG. 117B. A zoomed-in crystal structure view of the mutants at the catalytic deamination site, with the RNA with the flipped-out base also shown in purple.

FIGS. 118A-118D Adenosine deaminase activity of RESCUEr0-r16 and RESCUEr16-S. With REPAIR, efficiency of adenosine deamination is dependent on the guide design choice of position relative to the target adenosine and base flip selection (7), as ADAR2dd prefers to deaminate in mismatch bubbles. The position of the target base within the guide:target dsRNA duplex is particularly important, as Cas13 guides can be placed anywhere without any sequence restriction and there is a small window of optimal activity for ADAR2dd (7). For RESCUE, we tested all possible guide base-flips across from the target cytosine, and found that the optimal base flips for cytidine deamination were either C or U, with optimal editing of the UCG motif with a 30-nt guide RNA with the targeting base-flip position 26 base pairs from the 5′ end of the target. FIG. 118A. Luciferase correction via adenosine deamination of the Gluc transcript by RESCUEr0-r16 and RESCUEr16-S using a targeting guide RNA. Values represent mean+/−S.E.M (n=3). FIG. 118B. Luciferase correction via adenosine deamination of the Gluc transcript by RESCUEr0-r16 and RESCUEr16-S using a non-targeting guide RNA. Values represent mean+/−S.E.M (n=3). FIG. 118C. Percent editing of adenosine to inosine of the Gluc transcript by RESCUEr0-r16 and RESCUEr16-S using a targeting guide RNA. Values represent mean+/−S.E.M (n=3). FIG. 118D. Percent editing of adenosine to inosine of the Gluc transcript by RESCUEr0-r16 and RESCUEr16-S using a non-targeting guide RNA. Values represent mean+/−S.E.M (n=3).

FIGS. 119A-119D Evaluation of individual RESCUE mutations added on REPAIR (RESCUEr0) or individual mutations removed from RESCUEr16. FIG. 119A. Evaluation of C to U deaminase activity of individual RESCUE mutations added on REPAIR (RESCUEr0) targeting a site on the luciferase transcript, as measured by luciferase activity restoration. Values represent mean+/−S.E.M (n=3); WT, RESCUEr0 sequence. FIG. 119B. Evaluation of C to U deaminase activity of individual RESCUE mutations added on REPAIR (RESCUEr0) targeting a site on the luciferase transcript, as measured by percent editing. Values represent mean+/−S.E.M (n=3); WT, RESCUEr0 sequence. FIG. 119C. Evaluation of C to U deaminase activity of RESCUEr16 constructs with individual mutations removed targeting a site on the luciferase transcript, as measured by luciferase activity restoration. Values represent mean+/−S.E.M (n=3); WT, RESCUEr16 sequence. FIG. 119D. Evaluation of C to U deaminase activity of RESCUEr16 constructs with individual mutations removed targeting a site on the luciferase transcript, as measured by percent editing. Values represent mean+/−S.E.M (n=3); WT, RESCUEr16 sequence.

FIGS. 120A-120D Biochemical deamination activity of ADAR2 deaminase domain containing RESCUEr0, r2, r8, r13, and r16 mutations using recombinant protein. FIG. 120A. Adenosine deamination activity of ADAR2 deaminase domain protein containing various candidate mutations with a 22 bp double-stranded RNA substrate containing a center adenine mismatched with a cytidine. Reactions were incubated for varying time points and with and without the deaminase domain. Values represent mean+/−S.E.M (n=3, some error bars occluded by symbols). FIG. 120B. Cytidine deamination activity of ADAR2 deaminase domain protein containing various candidate mutations with a 22 bp double-stranded RNA substrate containing a center cytidine mismatched with a uridine. Reactions were incubated for varying time points and with and without the deaminase domain. Values represent mean+/−S.E.M (n=3, some error bars occluded by symbols). FIG. 120C. RESCUE r0 and r16 cytidine deaminase activity on RNA and DNA substrates, including a cytidine in RNA annealed to complementary DNA (RNA:DNA), a deoxycytidine in DNA annealed to complementary RNA (DNA:RNA), a deoxycytidine in double-stranded DNA (dsDNA), and a deoxycytidine in ssDNA. All double-stranded templates contain a cytidine mismatched with a thymidine. Values represent mean+/−S.E.M (n=3). FIG. 120D. RESCUE r0 and r16 adenosine deaminase activity on RNA and DNA substrates, including an adenosine in RNA annealed to complementary DNA (RNA:DNA), a deoxyadenosine in DNA annealed to complementary RNA (DNA:RNA), a deoxyadenosine in double-stranded DNA (dsDNA), and a deoxyadenosine in ssDNA. All double-stranded templates contain an adenosine mismatched with a cytidine. Values represent mean+/−S.E.M (n=3).

FIGS. 121A-121D Comparison of cytidine deaminase activity of RESCUEr16, full ADAR2 (with RESCUEr16 mutations), ADAR2 deaminase domain (with RESCUEr16 mutations), and without any protein. FIG. 121A. Adenosine deaminase activity measured by Cluc activity restoration with a targeting guide and RESCUEr16, full ADAR2 (with RESCUEr16 mutations), ADAR2 deaminase domain (with RESCUEr16 mutations), and no protein. Values represent mean+/−S.E.M (n=3). FIG. 121B. Cytidine deaminase activity measured by Gluc activity restoration with a targeting guide and RESCUEr16, full ADAR2 (with RESCUEr16 mutations), ADAR2 deaminase domain (with RESCUEr16 mutations), and no protein. Values represent mean+/−S.E.M (n=3). FIG. 121C. Percent editing of a site in the Gluc transcript with varying 5′ bases with a targeting guide and RESCUEr16, full ADAR2 (with RESCUEr16 mutations), ADAR2 deaminase domain (with RESCUEr16 mutations), and no protein. Values represent mean+/−S.E.M (n=3). FIG. 121D. Percent editing of a site in the Gluc transcript with varying 5′ bases with a non-targeting guide and RESCUEr16, full ADAR2 (with RESCUEr16 mutations), ADAR2 deaminase domain (with RESCUEr16 mutations), and no protein. Values represent mean+/−S.E.M (n=3).

FIGS. 122A-122C Comparison of cytidine deaminase activity of RESCUEr16, full ADAR2 (with RESCUEr16 mutations), ADAR2 deaminase domain (with RESCUEr16 mutations), and without any protein. FIG. 122A. Editing of a UCG site in the Gluc transcript with RESCUEr16 and guide RNAs containing varying mismatch positions. Values represent mean+/−S.E.M (n=3). FIG. 122B. Editing of a UCG site in the Gluc transcript with full-length ADAR2 (with RESCUEr16 mutations) and guide RNAs containing varying mismatch positions. Values represent mean+/−S.E.M (n=3). FIG. 122C. Editing of a UCG site in the Gluc transcript with ADAR2 deaminase domain (with RESCUEr16 mutations) and guide RNAs containing varying mismatch positions. Values represent mean+/−S.E.M (n=3).

FIGS. 123A-123C Cytidine deamination activity of RESCUEr16 on a Gluc transcript with guides without direct repeats of 30 or 50 nt in length and varying mismatches. FIG. 123A. Cytidine deamination activity of RESCUEr16 on a Gluc transcript with 30 nt guides without direct repeats and varying mismatches. Values represent mean+/−S.E.M (n=3). FIG. 123B. Cytidine deamination activity of RESCUEr16 on a Gluc transcript with 50 nt guides without direct repeats and varying mismatches. Values represent mean+/−S.E.M (n=3). FIG. 123C. Cytidine deamination activity of RESCUEr16 on a Gluc transcript with 30 nt guides with direct repeats and varying mismatches. Values represent mean+/−S.E.M (n=3).

FIGS. 124A-124F Cytidine deamination activity of alternative RNA editing technologies with RESCUE mutations incorporated into them. FIG. 124A. Cytidine deamination activity of MS2-recruited ADAR deaminase domain (24) with RESCUE mutations on a Gluc transcript with 30 nt guides with different base-flips and varying mismatches. Activity is measured by restoration of luciferase activity. Values represent mean+/−S.E.M (n=3); NT, non-targeting guide. FIG. 124B. Percent Gluc editing by MS2-recruited ADAR deaminase domain (24) with RESCUE mutations on a Gluc transcript with 30 nt guides with different base-flips and varying mismatches. Values represent mean+/−S.E.M (n=3); NT, non-targeting guide. FIG. 124C. Cytidine deamination activity of associated ADAR guide RNA technology (24) with the deaminase domain containing RESCUE mutations on a Gluc transcript with 30 nt guides with different base-flips and varying mismatches. Activity is measured by restoration of luciferase activity. Values represent mean+/−S.E.M (n=3); NT, non-targeting guide. FIG. 124D. Percent Gluc editing by associated ADAR guide RNA technology (24) with the deaminase domain containing RESCUE mutations on a Gluc transcript with 30 nt guides with different base-flips and varying mismatches. Values represent mean+/−S.E.M (n=3); NT, non-targeting guide. FIG. 124E. Cytidine deamination activity of guide RNA-recruited ADAR deaminase domain (11) with RESCUE mutations on a Gluc transcript with 30 nt guides with different base-flips and varying mismatches. Activity is measured by restoration of luciferase activity. Values represent mean+/−S.E.M (n=3); NT, non-targeting guide. FIG. 124F. Percent Gluc editing by guide RNA-recruited ADAR deaminase domain (11) with RESCUE mutations on a Gluc transcript with 30 nt guides with different base-flips and varying mismatches. Values represent mean+/−S.E.M (n=3); NT, non-targeting guide.

FIGS. 125A-125C Mismatch position tiling to find optimal editing guide design for RESCUE on endogenous target sites. FIG. 125A. Percent editing of endogenous target sites with varying base motifs with RESCUE and guides with mismatches at position 7, 9, 11, and 13 and U base flips. Values represent mean+/−S.E.M (n=3). FIG. 125B. Percent editing of endogenous target sites with varying base motifs with RESCUE and guides with mismatches at position 7, 9, 11, and 13 and C base flips. Values represent mean+/−S.E.M (n=3). FIG. 125C. Percent editing of endogenous target sites with varying base motifs with RESCUE and guides with mismatches at position 3, 5, 7, 9, and 11 and C and U base flips. Values represent mean+/−S.E.M (n=3).

FIGS. 126A-126B Cytidine deamination activity of RESCUEr0-r16 as measured by percent editing at various endogenous sites and at varying amounts. FIG. 126A. Heatmap depicting editing efficiency of RESCUEr0-r16 on a panel of three endogenous genes. Values represent mean of three replicates. FIG. 126B. Cytidine deamination activity of varying amounts of RESCUEr0-r16 as measured by percent editing at a KRAS site. Values represent mean of three replicates.

FIGS. 127A-127B Percent editing of various disease-relevant mutations on synthetic reporters. FIG. 127A. Editing efficiency of RESCUE on a set of synthetic versions of relevant T>C disease mutations with the best possible mismatch guide per target site. Editing rates vary between 1% and 42% and conditions are shown sorted by editing efficiency. All editing rates for synthetic sites are listed in Table 31. Values represent mean+/−S.E.M (n=3). FIG. 127B. Editing of disease-relevant mutations using RESCUE and guides with varying mismatch positions. Values represent mean+/−S.E.M (n=3).

FIG. 128 Percent editing at ApoE4 cytosines with RESCUE with guides of varying C and U mismatch positions. ApoE4 variants (rs429358 and rs7412) increase Alzheimer's risk markedly, and are edited by RESCUE at rates of up to 5% and 12% on the two sites. All editing rates for synthetic sites are listed in Table 31. Values represent mean+/−S.E.M (n=3).

FIGS. 129A-129F RNA editing and signal modulation of STAT1/STAT3 by RESCUE. STAT3 and STAT1 are transcription factors that play important roles in signal transduction via the JAK/STAT pathway and are typically activated via phosphorylation by cytokines and growth factors. To demonstrate signaling modulation via RNA editing, we altered activation of the STAT pathway by editing phosphorylation sites Y705 and S727 on STAT3 and Y701 and S727 on STAT1 with RESCUE over the course of 48 hours. FIG. 129A. Schematic of STAT3 domains and RESCUE guides targeting phosphorylated residues of STAT3 to alter associated signaling pathways (SEQ ID NO:768-770). FIG. 129B. Percent editing at relevant phosphorylated residues in STAT3 by RESCUE. In HEK293FT cells, we observed 6% editing of the S727 STAT3 site and 11% and 7% editing of the Y701 and S727 STAT1 sites, respectively. FIG. 129C. Inhibition of STAT3 signaling by RNA editing as measured by STAT3-driven luciferase expression with guides with different base-flips. These edits resulted in 13% repression of STAT3 and STAT1 activity. FIG. 129D. Percent editing at the S727F phosphorylated residue site in STAT1 by RESCUE with guides with varying base-flips. FIG. 129E. Percent editing at the Y701C phosphorylated residue site in STAT1 by RESCUE with guides with varying base-flips. FIG. 129F. Inhibition of STAT1 signaling by RNA editing with RESCUE as measured by STAT-driven luciferase expression.

FIGS. 130A-130B Modulation of b-catenin phosphorylation and cell growth in HUVEC cells. FIG. 130A. Quantitation of cellular growth due to activation of CTNNB1 signaling by RNA editing in HUVEC cells. RESCUE stimulated HUVEC growth to levels comparable to levels observed in cells overexpressing a b-catenin phosphorylation-null mutant. NT, non-targeting guide. FIG. 130B. Representative microscopy images of RESCUE CTNNB1 targeting and non-targeting guides in HUVEC cells.

FIG. 131. RESCUE C to U and A to I activity on transcripts with varying 5′ and 3′ flanking bases around the target site with different C-terminal truncations of dRanCas13b.

FIGS. 132A-132C Specificity of candidate rounds in the guide duplex window. FIG. 132A. Schematic of editing site of Gaussia luciferase mutant C82R, with the targeted C highlighted in red and nearby adenine bases numbered and highlighted in gray (SEQ ID NO:771). FIG. 132B. Percent editing at nearby adenine bases in Gaussia luciferase mutant C82R with targeting by RESCUEr0, RESCUEr8, and RESCUEr16. FIG. 132C. Percent editing of adenine to guanosine at adenine 20 by varying amounts of RESCUEr0-r16. Values represent mean of three replicates.

FIGS. 133A-133D Off-targets near target cytidines in single-plex and multiplex targeting by RESCUE r0, r8, and r16. FIG. 133A. Schematic of editing site of KRAS transcript, with the targeted C highlighted in red and nearby adenine bases numbered and highlighted in gray (SEQ ID NO:772). FIG. 133B. Percent editing at nearby adenine bases in KRAS transcript with targeting by RESCUEr0, RESCUEr8, and RESCUEr16. FIG. 133C. Schematic of multiplexed editing sites of CTNNB1 transcript, with the two targeted C sites highlighted in red and nearby adenine bases numbered and highlighted in gray (SEQ ID NO:773). FIG. 133D. Percent editing at nearby adenine bases in CTNNB1 transcript with multiplexed targeting by RESCUEr0, RESCUEr8, and RESCUEr16.

FIGS. 134A-134F Characterization of RESCUE and RESCUE-S transcriptome-wide off-targets. FIG. 134A. Predicted effect of transcriptome-wide off-target edits by RESCUE with a targeting guide against a site on the luciferase transcript. FIG. 134B. Predicted oncogenic effects of transcriptome-wide off-target edits by RESCUE with a targeting guide against a site on the luciferase transcript. FIG. 134C. Transcriptome-wide off-targets visualized as the number of off-target edits per transcript by RESCUE with a targeting guide against a site on the luciferase transcript. FIG. 134D. Predicted effect of transcriptome-wide off-target edits by RESCUE-S with a targeting guide against a site on the luciferase transcript. FIG. 134E. Predicted oncogenic effects of transcriptome-wide off-target edits by RESCUE-S with a targeting guide against a site on the luciferase transcript. FIG. 134F. Transcriptome-wide off-targets visualized as the number of off-target edits per transcript by RESCUE-S with a targeting guide against a site on the luciferase transcript.

FIGS. 135A-135C Characterization of 5′ and 3′ flanking bases of transcriptome-wide off-targets. FIG. 135A. The number of off-targets with each of all 16 possible 5′ and 3′ flanking bases by RESCUE with a targeting guide against a site on the luciferase transcript. FIG. 135B. The number of off-targets with each of all 16 possible 5′ and 3′ flanking bases by RESCUE-S with a targeting guide against a site on the luciferase transcript. FIG. 135C. Number of significantly differentially expressed transcripts in conditions with RESCUE constructs targeting luciferase transcripts.

FIGS. 136A-136B Biochemical deamination activity of ADAR2 deaminase domain containing RESCUEr0, RESCUEr16 and RESCUEr16-S mutations using recombinant protein. FIG. 136A. Adenosine deamination activity of ADAR2 deaminase domain protein containing various candidate mutations with a 22 bp double-stranded RNA substrate containing a center adenine mismatched with a cytosine. Reactions were incubated for varying time points and with and without the deaminase domain. Values represent mean+/−S.E.M (n=3, some error bars occluded by symbols). FIG. 136B. Cytidine deamination activity of ADAR2 deaminase domain protein containing various candidate mutations with a 22 bp double-stranded RNA substrate containing a center cytosine mismatched with a uridine. Reactions were incubated for varying time points and with and without the deaminase domain. Values represent mean+/−S.E.M (n=3, some error bars occluded by symbols).

FIGS. 137A-137D Adenosine deaminase activity of RESCUE and RESCUE-S. FIG. 137A. Luciferase correction via adenosine deamination of the Gluc transcript by RESCUE and RESCUE-S using a targeting guide RNA. Values represent mean+/−S.E.M (n=3). FIG. 137B. Luciferase correction via adenosine deamination of the Gluc transcript by RESCUE and RESCUE-S using a non-targeting guide RNA. Values represent mean+/−S.E.M (n=3). FIG. 137C. Percent editing of adenosine to inosine of the Gluc transcript by RESCUE and RESCUE-S using a targeting guide RNA. Values represent mean+/−S.E.M (n=3). FIG. 137D. Percent editing of adenosine to inosine of the Gluc transcript by RESCUE and RESCUE-S using a non-targeting guide RNA. Values represent mean+/−S.E.M (n=3).

FIGS. 138A-138C Cytidine deamination activity and off-target activity on a b-catenin target site using varying amounts of RESCUEr0-r16 and RESCUEr16-S. FIG. 138A. Schematic of editing site of CTNNB1 T41I, with the targeted C highlighted in red and the nearby off-target adenine bases highlighted in gray (SEQ ID NO:774). FIG. 138B. Percent editing of cytosine to uridine (T41A) by varying amounts of RESCUEr0-r16 and RESCUEr16-S. Values represent mean of three replicates. FIG. 138C. Percent editing of adenine to guanosine at the off-target adenine by varying amounts of RESCUEr0-r16 and RESCUEr16-S. Values represent mean of three replicates.

FIGS. 139A-139C Editing of STAT1 and STAT3 by RESCUE and RESCUE-S. FIG. 139A. Schematic of edited sites at STAT3 by C to U and A to I editing (SEQ ID NO:775-778). FIG. 139B. Percent A to I editing at tyrosine residues in STAT1 and STAT3 by RESCUE and RESCUE-S. Values represent mean+/−S.E.M (n=3); NT, non-targeting guide. FIG. 139C. Percent C to U editing at serine residues in STAT1 and STAT3 by RESCUE and RESCUE-S. Values represent mean+/−S.E.M (n=3); NT, non-targeting guide.

FIGS. 140A-140E On-target and off-target editing of RESCUE and RESCUE-S on endogenous targets. FIG. 140A. Percent editing of endogenous target sites with varying base motifs with RESCUE and RESCUE-S. Values represent mean+/−S.E.M (n=3). FIG. 140B. Percent editing at neighboring adenine bases in NRAS I21I with targeting by RESCUE and RESCUE-S. FIG. 140C. Percent editing at neighboring adenine bases in NF2 T21M with targeting by RESCUE and RESCUE-S. FIG. 140D. Percent editing at neighboring adenine bases in RAF1 P30S with targeting by RESCUE and RESCUE-S. FIG. 140E. Percent editing at neighboring adenine bases in CTNNB1 P44S with targeting by RESCUE and RESCUE-S.

FIG. 141 Summary of amino acid changes enabled by RESCUE. Codon table showing all potential amino acid changes possible by RESCUE.

The figures herein are for illustrative purposes only and are not necessarily drawn to scale.

DETAILED DESCRIPTION OF THE EXAMPLE EMBODIMENTS

Definitions

Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. Definitions of common terms and techniques in molecular biology may be found in Molecular Cloning: A Laboratory Manual, 2nd edition (1989) (Sambrook, Fritsch, and Maniatis); Molecular Cloning: A Laboratory Manual, 4th edition (2012) (Green and Sambrook); Current Protocols in Molecular Biology (1987) (F. M. Ausubel et al. eds.); the series Methods in Enzymology (Academic Press, Inc.): PCR 2: A Practical Approach (1995) (M. J. MacPherson, B. D. Hames, and G. R. Taylor eds.): Antibodies, A Laboratory Manual (1988) (Harlow and Lane, eds.): Antibodies A Laboratory Manual, 2nd edition 2013 (E. A. Greenfield ed.); Animal Cell Culture (1987) (R.I. Freshney, ed.); Benjamin Lewin, Genes IX, published by Jones and Bartlett, 2008 (ISBN 0763752223); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0632021829); Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 9780471185710); Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, N.Y. 1994), March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, N.Y. 1992); and Marten H. Hofker and Jan van Deursen, Transgenic Mouse Methods and Protocols, 2nd edition (2011).

As used herein, the singular forms “a”, “an”, and “the” include both singular and plural referents unless the context clearly dictates otherwise.

The term “optional” or “optionally” means that the subsequently described event, circumstance or substituent may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.

The recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within the respective ranges, as well as the recited endpoints.

The terms “about” or “approximately” as used herein when referring to a measurable value such as a parameter, an amount, a temporal duration, and the like, are meant to encompass variations of and from the specified value, such as variations of +/−10% or less, +/−5% or less, +/−1% or less, and +/−0.1% or less of and from the specified value, insofar as such variations are appropriate to perform in the disclosed invention. It is to be understood that the value to which the modifier “about” or “approximately” refers is itself also specifically, and preferably, disclosed.

As used herein, a “biological sample” may contain whole cells and/or live cells and/or cell debris. The biological sample may contain (or be derived from) a “bodily fluid”. The present invention encompasses embodiments wherein the bodily fluid is selected from amniotic fluid, aqueous humour, vitreous humour, bile, blood serum, breast milk, cerebrospinal fluid, cerumen (earwax), chyle, chyme, endolymph, perilymph, exudates, feces, female ejaculate, gastric acid, gastric juice, lymph, mucus (including nasal drainage and phlegm), pericardial fluid, peritoneal fluid, pleural fluid, pus, rheum, saliva, sebum (skin oil), semen, sputum, synovial fluid, sweat, tears, urine, vaginal secretion, vomit and mixtures of one or more thereof. Biological samples include cell cultures, bodily fluids, and cell cultures from bodily fluids. Bodily fluids may be obtained from a mammal organism, for example by puncture, or other collecting or sampling procedures.

The terms “subject,” “individual,” and “patient” are used interchangeably herein to refer to a vertebrate, preferably a mammal, more preferably a human. Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells and their progeny of a biological entity obtained in vivo or cultured in vitro are also encompassed.

Whenever reference is made herein to Cas13, it will be understood that a mutated or engineered Cas13 according to the invention as described herein is meant, unless explicitly indicated otherwise. Whenever reference is made herein to Cas13, preferably a mutated or engineered Cas13a, Cas13b, Cas13c, or Cas13d according to the invention as described herein is meant, unless explicitly indicated otherwise. Whenever reference is made herein to Cas13, preferably a mutated or engineered Cas13b according to the invention as described herein is meant, unless explicitly indicated otherwise.

Various embodiments are described hereinafter. It should be noted that the specific embodiments are not intended as an exhaustive description or as a limitation to the broader aspects discussed herein. One aspect described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced with any other embodiment(s). Reference throughout this specification to “one embodiment”, “an embodiment,” “an example embodiment,” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” or “an example embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to a person skilled in the art from this disclosure, in one or more embodiments. Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention. For example, in the appended claims, any of the claimed embodiments can be used in any combination.

All publications, published patent documents, and patent applications cited herein are hereby incorporated by reference to the same extent as though each individual publication, published patent document, or patent application was specifically and individually indicated as being incorporated by reference.

Overview

In one aspect, embodiments disclosed herein are directed to an engineered CRISPR-Cas protein comprising one or more modified amino acids. In certain embodiments, the engineered CRISPR-Cas protein increases or decreases one or more of PFS recognition/specificity, gRNA binding, protease activity, polynucleotide binding capability, stability, specificity, target binding, off-target binding, and/or catalytic activity as compared to a corresponding wild-type CRISPR-Cas protein. In certain embodiments, the CRISPR-Cas protein comprises one or more HEPN domains, and comprises one or more modified amino acids. The modified amino acids may interact with a guide RNA that forms a complex with the CRISPR-Cas protein, and/or are in a HEPN active site, an inter-domain linker domain, a lid domain, a helical domain or a bridge helix domain of the CRISPR-Cas protein, or a combination thereof. In some examples, the engineered CRISPR-Cas protein comprises one or more HEPN domains and further comprises one or more modified amino acids, wherein the amino acids: interact with a guide RNA that forms a complex with the engineered CRISPR-Cas protein; are in a HEPN active site, an inter-domain linker domain, a lid domain, a helical domain 1, a helical domain 2, or a bridge helix domain of the engineered CRISPR-Cas protein; or a combination thereof.

In another aspect, embodiments disclosed herein provide a sub-set of newly identified CRISPR-Cas orthologs that are smaller in size than previously discovered CRISPR-Cas orthologs, including further modifications to and uses thereof. In particular embodiments, the CRISPR-Cas orthologs are less than about 1000 amino acids and can be optionally provided as part of a fusion protein.

Engineered nucleotide deaminases are also provided herein. In certain embodiments, the engineered nucleotide deaminases are adenosine deaminases that can be engineered to comprise cytidine deaminase activity. In embodiments, the engineered nucleotide deaminases may be fused to a Cas protein, including the CRISPR-Cas proteins disclosed herein.

In another aspect, embodiments disclosed herein include systems and uses for such modified CRISPR-Cas proteins including, but not limited to, diagnostics, base editing therapeutics and methods of detection. Fusion proteins comprising a CRISPR Cas protein, including those disclosed herein, and nucleotide deaminase may also be used for base editing. Delivery of the proteins and systems disclosed is also provided, including to a variety of cells and via a variety of particles, vesicles and vectors.

CRISPR-Cas Systems in General

In general, the CRISPR-Cas or CRISPR system refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene, a tracr (trans-activating CRISPR) sequence (e.g. tracrRNA or an active partial tracrRNA), a tracr-mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer” in the context of an endogenous CRISPR system), or “RNA(s)” as that term is herein used (e.g., RNA(s) to guide Cas, such as Cas9, e.g. CRISPR RNA and transactivating (tracr) RNA or a single guide RNA (sgRNA) (chimeric RNA)) or other sequences and transcripts from a CRISPR locus. In general, a CRISPR system is characterized by elements that promote the formation of a CRISPR complex at the site of a target sequence (also referred to as a protospacer in the context of an endogenous CRISPR system). When the CRISPR protein is a Class 2 Type VI effector, a tracrRNA is not required. In an engineered system of the invention, the direct repeat may encompass naturally-occurring sequences or non-naturally-occurring sequences. The direct repeat of the invention is not limited to naturally occurring lengths and sequences. A direct repeat can be 36 nt in length, but longer or shorter direct repeats are also possible. For example, a direct repeat can be 30 nt or longer, such as 30-100 nt or longer. For example, a direct repeat can be 30 nt, 40 nt, 50 nt, 60 nt, 70 nt, 80 nt, 90 nt, 100 nt or longer in length. In some embodiments, a direct repeat of the invention can include synthetic nucleotide sequences inserted between the 5′ and 3′ ends of naturally occurring direct repeats. In certain embodiments, the inserted sequence may be self-complementary, for example, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% self-complementary. Furthermore, a direct repeat of the invention may include insertions of nucleotides such as an aptamer or sequences that bind to an adapter protein (for association with functional domains). In certain embodiments, one end of a direct repeat containing such an insertion is roughly the first half of a short DR and the other end is roughly the second half of the short DR.

The CRISPR-Cas protein (used interchangeably herein with “Cas protein”, “Cas effector”) may include Cas9, Cas12 (e.g., Cas12a, Cas12b, Cas12c, Cas12d, etc.), Cas13 (e.g., Cas13a, Cas13b (such as Cas13b-t1, Cas13b-t2, Cas13b-t3), Cas13c, Cas13d, etc.), Cas14, CasX, and CasY. In some embodiments, the CRISPR-Cas protein may be a type VI CRISPR-Cas protein. For example, the Type VI CRISPR-Cas protein may be a Cas13 protein. The Cas13 protein may be a Cas13a, a Cas13b, a Cas13c, or a Cas13d. In some examples, the CRISPR-Cas protein is Cas13a. In some examples, the CRISPR-Cas protein is Cas13b. In some examples, the CRISPR-Cas protein is Cas13c. In some examples, the CRISPR-Cas protein is Cas13d.

In some embodiments, an engineered CRISPR-Cas protein comprises one or more HEPN domains and is less than 1000 amino acids in length. For example, the protein may be less than 950, less than 900, less than 850, less than 800, or less than 750 amino acids in size.

In certain example embodiments, the CRISPR-Cas protein comprises at least one HEPN domain, including but not limited to the HEPN domains described herein, HEPN domains known in the art, and domains recognized to be HEPN domains by comparison to consensus sequence motifs. Several such domains are provided herein. In one non-limiting example, a consensus sequence can be derived from the sequences of C2c2 or Cas13b orthologs provided herein. In certain example embodiments, the effector protein comprises a single HEPN domain. In certain other example embodiments, the effector protein comprises two HEPN domains.

In one example embodiment, the one or more HEPN domains comprises a RxxxxH motif. The RxxxxH motif sequence can be, without limitation, from a HEPN domain described herein or a HEPN domain known in the art. RxxxxH motif sequences further include motif sequences created by combining portions of two or more HEPN domains. As noted, consensus sequences can be derived from the sequences of the orthologs disclosed in U.S. Provisional Patent Application 62/432,240 entitled “Novel CRISPR Enzymes and Systems,” U.S. Provisional Patent Application 62/471,710 entitled “Novel Type VI CRISPR Orthologs and Systems” filed on Mar. 15, 2017, and U.S. Provisional patent application entitled “Novel Type VI CRISPR Orthologs and Systems,” labeled as attorney docket number 47627-05-2133 and filed on Apr. 12, 2017.

In an embodiment of the invention, a HEPN domain comprises at least one RxxxxH motif comprising the sequence of R{N/H/K}X₁X₂X₃H. In an embodiment of the invention, a HEPN domain comprises a RxxxxH motif comprising the sequence of R{N/H}X₁X₂X₃H. In an embodiment of the invention, a HEPN domain comprises the sequence of R{N/K}X₁X₂X₃H. In certain embodiments, X₁ is R, S, D, E, Q, N, G, Y, or H. In certain embodiments, X₂ is I, S, T, V, or L. In certain embodiments, X₃ is L, F, N, Y, V, I, S, D, E, or A.
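By way of illustration only, the R{N/H/K}X₁X₂X₃H motif and the residue sets listed above can be written as a regular expression and used to scan a protein sequence. The following is a minimal sketch, not a method of the disclosure: the function name `find_rxxxxh_motifs` and the toy sequence are hypothetical, and a real HEPN annotation would normally rely on profile-based domain searches rather than a single pattern.

```python
import re

# Residue sets copied from the paragraph above; expressing them as a
# regular expression is an illustrative assumption.
X1 = "RSDEQNGYH"
X2 = "ISTVL"
X3 = "LFNYVISDEA"
MOTIF = f"R[NHK][{X1}][{X2}][{X3}]H"

def find_rxxxxh_motifs(protein_seq):
    """Return (0-based start, matched motif) for every motif occurrence."""
    # The lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile(f"(?=({MOTIF}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(protein_seq.upper())]

# Hypothetical toy sequence, not taken from any Cas13 ortholog.
toy_sequence = "MKTRNRILHAAGGRHDSFHKK"
print(find_rxxxxh_motifs(toy_sequence))
```

Narrowing the second position to `[NH]` or `[NK]` gives the two more restrictive motif variants recited above.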

In the context of formation of a CRISPR complex, “target sequence” refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a CRISPR complex. A target sequence may comprise any polynucleotide, such as DNA or RNA polynucleotides. In some embodiments, a target sequence is located in the nucleus or cytoplasm of a cell. In some embodiments, direct repeats may be identified in silico by searching for repetitive motifs that fulfill any or all of the following criteria: 1. found in a 2 kb window of genomic sequence flanking the type II CRISPR locus; 2. span from 20 to 50 bp; and 3. interspaced by 20 to 50 bp. In some embodiments, 2 of these criteria may be used, for instance 1 and 2, 2 and 3, or 1 and 3. In some embodiments, all 3 criteria may be used.
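The three in silico criteria for direct repeats can be sketched as a simple exact-match scan. This is an illustrative simplification under stated assumptions: the function name `candidate_direct_repeats` is hypothetical, the caller is assumed to supply the flanking genomic window, only exact repeats of one chosen length are considered, and the toy sequence is invented for the example.

```python
from collections import defaultdict

def candidate_direct_repeats(window, repeat_len, min_gap=20, max_gap=50, max_window=2000):
    """Return {repeat: [start positions]} for exact repeats with spacer-sized gaps."""
    if not 20 <= repeat_len <= 50:                 # criterion 2: repeat spans 20-50 bp
        raise ValueError("repeat_len should be between 20 and 50 bp")
    window = window.upper()[:max_window]           # criterion 1: at most a 2 kb window
    positions = defaultdict(list)
    for i in range(len(window) - repeat_len + 1):
        positions[window[i:i + repeat_len]].append(i)
    hits = {}
    for kmer, starts in positions.items():
        # Gap between the end of one copy and the start of the next copy.
        gaps = [b - (a + repeat_len) for a, b in zip(starts, starts[1:])]
        if gaps and all(min_gap <= g <= max_gap for g in gaps):  # criterion 3
            hits[kmer] = starts
    return hits

# Hypothetical toy locus: three copies of a 24 bp repeat separated by 25 bp and 30 bp.
repeat = "ACGTTGCA" * 3
toy_window = repeat + "T" * 25 + repeat + "G" * 30 + repeat
hits = candidate_direct_repeats(toy_window, repeat_len=24)
print(hits)
```

Running the scan for each repeat length from 20 to 50 and tolerating a small number of mismatches between copies would be natural extensions; neither is shown here.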

In embodiments of the invention the terms guide sequence and guide RNA, e.g., RNA capable of guiding CRISPR-Cas effector proteins to a target locus, are used interchangeably as in herein cited documents such as WO 2014/093622 (PCT/US2013/074667). In general, a guide sequence (or spacer sequence) is any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of a CRISPR complex to the target sequence. In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g. the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies; available at www.novocraft.com), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net). In some embodiments, a guide sequence (or spacer sequence) is about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some embodiments, a guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length. Preferably the guide sequence is 10-40 nucleotides long, such as 20-30 or 20-40 nucleotides long or longer, such as 30 nucleotides long or about 30 nucleotides long. In certain embodiments, the guide sequence is 10-30 nucleotides long, such as 20-30 or 20-40 nucleotides long or longer, such as 30 nucleotides long or about 30 nucleotides long for CRISPR-Cas effectors. In certain embodiments, the guide sequence is 10-30 nucleotides long, such as 20-30 nucleotides long, such as 30 nucleotides long. The ability of a guide sequence to direct sequence-specific binding of a CRISPR complex to a target sequence may be assessed by any suitable assay. For example, the components of a CRISPR system sufficient to form a CRISPR complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target sequence, such as by transfection with vectors encoding the components of the CRISPR sequence, followed by an assessment of preferential cleavage within the target sequence, such as by Surveyor assay as described herein. Similarly, cleavage of a target polynucleotide sequence may be evaluated in a test tube by providing the target sequence, components of a CRISPR complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions. Other assays are possible, and will occur to those skilled in the art.
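As an illustration of the degree-of-complementarity figure discussed above, the following sketch scores an RNA guide against an equal-length RNA target site. It is a deliberate simplification: it assumes an ungapped, antiparallel, Watson-Crick-only pairing rather than one of the alignment algorithms named above, and the function name `percent_complementarity` and both toy sequences are hypothetical.

```python
# Watson-Crick RNA base pairs only; G-U wobble pairing is ignored in this sketch.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def percent_complementarity(guide, target_site):
    """Percent of guide positions that pair with an equal-length target site.

    Both sequences are written 5'->3'; the guide pairs antiparallel to the target.
    """
    guide = guide.upper().replace("T", "U")
    target_site = target_site.upper().replace("T", "U")
    if not guide or len(guide) != len(target_site):
        raise ValueError("guide and target site must be non-empty and of equal length")
    paired = sum(PAIR.get(g) == t for g, t in zip(guide, reversed(target_site)))
    return 100.0 * paired / len(guide)

# Hypothetical 10-nt example: the first guide is the exact reverse complement.
toy_target = "AUGGCUAGCU"
perfect_guide = "AGCUAGCCAU"
one_mismatch_guide = "UGCUAGCCAU"
print(percent_complementarity(perfect_guide, toy_target))
print(percent_complementarity(one_mismatch_guide, toy_target))
```

A gapped aligner would be needed for guides and targets of unequal length or with insertions and deletions.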

In classic CRISPR-Cas systems, the degree of complementarity between a guide sequence and its corresponding target sequence can be about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or 100%; a guide RNA or crRNA can be about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length; or a guide RNA or crRNA can be less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length; and advantageously tracr RNA is 30 or 50 nucleotides in length. However, an aspect of the invention is to reduce off-target interactions, e.g., reduce the guide interacting with a target sequence having low complementarity. Indeed, in the examples, it is shown that the invention involves mutations that result in the CRISPR-Cas system being able to distinguish between target and off-target sequences that have greater than 80% to about 95% complementarity, e.g., 83%-84% or 88-89% or 94-95% complementarity (for instance, distinguishing between a target having 18 nucleotides from an off-target of 18 nucleotides having 1, 2 or 3 mismatches). Accordingly, in the context of the present invention the degree of complementarity between a guide sequence and its corresponding target sequence is greater than 94.5% or 95% or 95.5% or 96% or 96.5% or 97% or 97.5% or 98% or 98.5% or 99% or 99.5% or 99.9%, or 100%. Off target is less than 100% or 99.9% or 99.5% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% or 94% or 93% or 92% or 91% or 90% or 89% or 88% or 87% or 86% or 85% or 84% or 83% or 82% or 81% or 80% complementarity between the sequence and the guide, with it advantageous that off target is 100% or 99.9% or 99.5% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% complementarity between the sequence and the guide.
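The percentage ranges quoted for an 18-nucleotide sequence follow from simple arithmetic: with m mismatches over L positions, complementarity is 100 × (L − m) / L. The short sketch below works this out; the function name `complementarity_from_mismatches` is hypothetical and introduced only for this example.

```python
def complementarity_from_mismatches(length, mismatches):
    """Percent complementarity of a sequence of `length` nt carrying `mismatches` mismatches."""
    if length <= 0 or not 0 <= mismatches <= length:
        raise ValueError("need length > 0 and 0 <= mismatches <= length")
    return 100.0 * (length - mismatches) / length

# 18-nt off-target sequences with 1, 2 or 3 mismatches, as in the paragraph above.
for m in (1, 2, 3):
    print(m, round(complementarity_from_mismatches(18, m), 1))
```

One, two and three mismatches give about 94.4%, 88.9% and 83.3%, which fall within the 94-95%, 88-89% and 83%-84% ranges recited above.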

In certain embodiments, modulations of cleavage efficiency can be exploited by introduction of mismatches, e.g. 1 or more mismatches, such as 1 or 2 mismatches between spacer sequence and target sequence, including the position of the mismatch along the spacer/target. The more central (i.e. not 3′ or 5′) a mismatch, for instance a double mismatch, is, the more cleavage efficiency is affected. Accordingly, by choosing mismatch position along the spacer, cleavage efficiency can be modulated. By means of example, if less than 100% cleavage of targets is desired (e.g. in a cell population), 1 or more, such as preferably 2 mismatches between spacer and target sequence may be introduced in the spacer sequences. The more central along the spacer the mismatch position, the lower the cleavage percentage.

The methods according to the invention as described herein comprehend inducing one or more nucleotide modifications in a eukaryotic cell (in vitro, i.e. in an isolated eukaryotic cell) as herein discussed comprising delivering to the cell a vector as herein discussed. The mutation(s) can include the introduction, deletion, or substitution of one or more nucleotides at each target sequence of cell(s) via the guide(s) RNA(s) or sgRNA(s). The mutations can include the introduction, deletion, or substitution of 1-75 nucleotides at each target sequence of said cell(s) via the guide(s) RNA(s). The mutations can include the introduction, deletion, or substitution of 1, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence of said cell(s) via the guide(s) RNA(s). The mutations can include the introduction, deletion, or substitution of 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence of said cell(s) via the guide(s) RNA(s). The mutations can include the introduction, deletion, or substitution of 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence of said cell(s) via the guide(s) RNA(s). The mutations can include the introduction, deletion, or substitution of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence of said cell(s) via the guide(s) RNA(s). The mutations can include the introduction, deletion, or substitution of 40, 45, 50, 75, 100, 200, 300, 400 or 500 nucleotides at each target sequence of said cell(s) via the guide(s) RNA(s).

For minimization of toxicity and off-target effect, it will be important to control the concentration of Cas mRNA or protein and guide RNA delivered. Optimal concentrations of Cas mRNA or protein and guide RNA can be determined by testing different concentrations in a cellular or non-human eukaryote animal model and using deep sequencing to analyze the extent of modification at potential off-target genomic loci.

Typically, in the context of an endogenous CRISPR system, formation of a CRISPR complex (comprising a guide sequence hybridized to a target sequence and complexed with one or more Cas proteins) results in cleavage in or near (e.g. within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, or more base pairs from) the target sequence, but may depend on for instance secondary structure, in particular in the case of RNA targets. In some cases, in the context of an endogenous CRISPR system, formation of a CRISPR complex (comprising a guide sequence hybridized to a target sequence and complexed with one or more Cas proteins) results in cleavage of one or both strands (if applicable) in or near (e.g. within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, or more base pairs from) the target sequence.

In particularly preferred embodiments according to the invention, the guide RNA (capable of guiding Cas to a target locus) may comprise (1) a guide sequence capable of hybridizing to a target locus (a polynucleotide target locus, such as an RNA target locus) in the eukaryotic cell; and (2) a direct repeat (DR) sequence, which reside in a single RNA, i.e. an sgRNA (arranged in a 5′ to 3′ orientation) or crRNA.

With respect to general information on CRISPR-Cas Systems, components thereof, and delivery of such components, including methods, materials, delivery vehicles, vectors, particles, AAV, and making and using thereof, including as to amounts and formulations, all useful in the practice of the instant invention, reference is made to: U.S. Pat. Nos. 8,999,641, 8,993,233, 8,945,839, 8,932,814, 8,906,616, 8,895,308, 8,889,418, 8,889,356, 8,871,445, 8,865,406, 8,795,965, 8,771,945 and 8,697,359; US Patent Publications US 2014-0310830 (U.S. application Ser. No. 14/105,031), US 2014-0287938 A1 (U.S. application Ser. No. 14/213,991), US 2014-0273234 A1 (U.S. application Ser. No. 14/293,674), US 2014-0273232 A1 (U.S. application Ser. No. 14/290,575), US 2014-0273231 (U.S. application Ser. No. 14/259,420), US 2014-0256046 A1 (U.S. application Ser. No. 14/226,274), US 2014-0248702 A1 (U.S. application Ser. No. 14/258,458), US 2014-0242700 A1 (U.S. application Ser. No. 14/222,930), US 2014-0242699 A1 (U.S. application Ser. No. 14/183,512), US 2014-0242664 A1 (U.S. application Ser. No. 14/104,990), US 2014-0234972 A1 (U.S. application Ser. No. 14/183,471), US 2014-0227787 A1 (U.S. application Ser. No. 14/256,912), US 2014-0189896 A1 (U.S. application Ser. No. 14/105,035), US 2014-0186958 (U.S. application Ser. No. 14/105,017), US 2014-0186919 A1 (U.S. application Ser. No. 14/104,977), US 2014-0186843 A1 (U.S. application Ser. No. 14/104,900), US 2014-0179770 A1 (U.S. application Ser. No. 14/104,837) and US 2014-0179006 A1 (U.S. application Ser. No. 14/183,486), US 2014-0170753 (U.S. application Ser. No. 14/183,429); European Patents EP 2 784 162 B1 and EP 2 771 468 B1; European Patent Applications EP 2 771 468 (EP13818570.7), EP 2 764 103 (EP13824232.6), and EP 2 784 162 (EP14170383.5); and PCT Patent Publications WO 2014/093661 (PCT/US2013/074743), WO 2014/093694 (PCT/US2013/074790), WO 2014/093595 (PCT/US2013/074611), WO 2014/093718 (PCT/US2013/074825), WO 2014/093709 (PCT/US2013/074812), WO 2014/093622 (PCT/US2013/074667), WO 2014/093635 (PCT/US2013/074691), WO 2014/093655 (PCT/US2013/074736), WO 2014/093712 (PCT/US2013/074819), WO 2014/093701 (PCT/US2013/074800), WO 2014/018423 (PCT/US2013/051418), WO 2014/204723 (PCT/US2014/041790), WO 2014/204724 (PCT/US2014/041800), WO 2014/204725 (PCT/US2014/041803), WO 2014/204726 (PCT/US2014/041804), WO 2014/204727 (PCT/US2014/041806), WO 2014/204728 (PCT/US2014/041808), WO 2014/204729 (PCT/US2014/041809). Reference is also made to U.S. provisional patent applications 61/758,468; 61/802,174; 61/806,375; 61/814,263; 61/819,803 and 61/828,130, filed on Jan. 30, 2013; Mar. 15, 2013; Mar. 28, 2013; Apr. 20, 2013; May 6, 2013 and May 28, 2013 respectively. Reference is also made to U.S. provisional patent application 61/836,123, filed on Jun. 17, 2013. Reference is additionally made to U.S. provisional patent applications 61/835,931, 61/835,936, 61/836,127, 61/836,101, 61/836,080 and 61/835,973, each filed Jun. 17, 2013. Further reference is made to U.S. provisional patent applications 61/862,468 and 61/862,355 filed on Aug. 5, 2013; 61/871,301 filed on Aug. 28, 2013; 61/960,777 filed on Sep. 25, 2013 and 61/961,980 filed on Oct. 28, 2013. Reference is yet further made to: PCT Patent applications Nos: PCT/US2014/041803, PCT/US2014/041800, PCT/US2014/041809, PCT/US2014/041804 and PCT/US2014/041806, each filed Jun. 10, 2014; PCT/US2014/041808 filed Jun. 11, 2014; and PCT/US2014/62558 filed Oct. 28, 2014, and U.S. Provisional Patent Applications Ser. Nos. 61/915,150, 61/915,301, 61/915,267 and 61/915,260, each filed Dec. 12, 2013; 61/757,972 and 61/768,959, filed on Jan. 29, 2013 and Feb. 25, 2013; 61/835,936, 61/836,127, 61/836,101, 61/836,080, 61/835,973, and 61/835,931, filed Jun. 17, 2013; 62/010,888 and 62/010,879, both filed Jun. 11, 2014; 62/010,329 and 62/010,441, each filed Jun. 10, 2014; 61/939,228 and 61/939,242, each filed Feb. 12, 2014; 61/980,012, filed Apr. 15, 2014; 62/038,358, filed Aug. 17, 2014; 62/054,490, 62/055,484, 62/055,460 and 62/055,487, each filed Sep. 25, 2014; and 62/069,243, filed Oct. 27, 2014. Reference is also made to U.S. provisional patent applications Nos. 62/055,484, 62/055,460, and 62/055,487, filed Sep. 25, 2014; U.S. provisional patent application 61/980,012, filed Apr. 15, 2014; and U.S. provisional patent application 61/939,242 filed Feb. 12, 2014. Reference is made to PCT application designating, inter alia, the United States, application No. PCT/US14/41806, filed Jun. 10, 2014. Reference is made to U.S. provisional patent application 61/930,214 filed on Jan. 22, 2014. Reference is made to U.S. provisional patent applications 61/915,251; 61/915,260 and 61/915,267, each filed on Dec. 12, 2013. Reference is made to US provisional patent application U.S. Ser. No. 61/980,012 filed Apr. 15, 2014.

Mention is also made of U.S. application 62/091,455, filed 12 Dec. 2014, PROTECTED GUIDE RNAS (PGRNAS); U.S. application 62/096,708, 24 Dec. 2014, PROTECTED GUIDE RNAS (PGRNAS); U.S. application 62/091,462, 12 Dec. 2014, DEAD GUIDES FOR CRISPR TRANSCRIPTION FACTORS; U.S. application 62/096,324, 23 Dec. 2014, DEAD GUIDES FOR CRISPR TRANSCRIPTION FACTORS; U.S. application 62/091,456, 12 Dec. 2014, ESCORTED AND FUNCTIONALIZED GUIDES FOR CRISPR-CAS SYSTEMS; U.S. application 62/091,461, 12 Dec. 2014, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS FOR GENOME EDITING AS TO HEMATOPOETIC STEM CELLS (HSCs); U.S. application 62/094,903, 19 Dec. 2014, UNBIASED IDENTIFICATION OF DOUBLE-STRAND BREAKS AND GENOMIC REARRANGEMENT BY GENOME-WISE INSERT CAPTURE SEQUENCING; U.S. application 62/096,761, 24 Dec. 2014, ENGINEERING OF SYSTEMS, METHODS AND OPTIMIZED ENZYME AND GUIDE SCAFFOLDS FOR SEQUENCE MANIPULATION; U.S. application 62/098,059, 30 Dec. 2014, RNA-TARGETING SYSTEM; U.S. application 62/096,656, 24 Dec. 2014, CRISPR HAVING OR ASSOCIATED WITH DESTABILIZATION DOMAINS; U.S. application 62/096,697, 24 Dec. 2014, CRISPR HAVING OR ASSOCIATED WITH AAV; U.S. application 62/098,158, 30 Dec. 2014, ENGINEERED CRISPR COMPLEX INSERTIONAL TARGETING SYSTEMS; U.S. application 62/151,052, 22 Apr. 2015, CELLULAR TARGETING FOR EXTRACELLULAR EXOSOMAL REPORTING; U.S. application 62/054,490, 24 Sep. 2014, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS FOR TARGETING DISORDERS AND DISEASES USING PARTICLE DELIVERY COMPONENTS; U.S. application 62/055,484, 25 Sep. 2014, SYSTEMS, METHODS AND COMPOSITIONS FOR SEQUENCE MANIPULATION WITH OPTIMIZED FUNCTIONAL CRISPR-CAS SYSTEMS; U.S. application 62/087,537, 4 Dec. 2014, SYSTEMS, METHODS AND COMPOSITIONS FOR SEQUENCE MANIPULATION WITH OPTIMIZED FUNCTIONAL CRISPR-CAS SYSTEMS; U.S. application 62/054,651, 24 Sep. 2014, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS FOR MODELING COMPETITION OF MULTIPLE CANCER MUTATIONS IN VIVO; U.S. application 62/067,886, 23 Oct. 2014, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS FOR MODELING COMPETITION OF MULTIPLE CANCER MUTATIONS IN VIVO; U.S. application 62/054,675, 24 Sep. 2014, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS IN NEURONAL CELLS/TISSUES; U.S. application 62/054,528, 24 Sep. 2014, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS IN IMMUNE DISEASES OR DISORDERS; U.S. application 62/055,454, 25 Sep. 2014, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS FOR TARGETING DISORDERS AND DISEASES USING CELL PENETRATION PEPTIDES (CPP); U.S. application 62/055,460, 25 Sep. 2014, MULTIFUNCTIONAL-CRISPR COMPLEXES AND/OR OPTIMIZED ENZYME LINKED FUNCTIONAL-CRISPR COMPLEXES; U.S. application 62/087,475, 4 Dec. 2014, FUNCTIONAL SCREENING WITH OPTIMIZED FUNCTIONAL CRISPR-CAS SYSTEMS; U.S. application 62/055,487, 25 Sep. 2014, FUNCTIONAL SCREENING WITH OPTIMIZED FUNCTIONAL CRISPR-CAS SYSTEMS; U.S. application 62/087,546, 4 Dec. 2014, MULTIFUNCTIONAL CRISPR COMPLEXES AND/OR OPTIMIZED ENZYME LINKED FUNCTIONAL-CRISPR COMPLEXES; and U.S. application 62/098,285, 30 Dec. 2014, CRISPR MEDIATED IN VIVO MODELING AND GENETIC SCREENING OF TUMOR GROWTH AND METASTASIS.

Also with respect to general information on CRISPR-Cas Systems, mention is made of the following (also hereby incorporated herein by reference):

-   Multiplex genome engineering using CRISPR/Cas systems. Cong, L., Ran, F. A., Cox, D., Lin, S., Barretto, R., Habib, N., Hsu, P. D., Wu, X., Jiang, W., Marraffini, L. A., & Zhang, F. Science February 15; 339(6121):819-23 (2013);
-   RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Jiang W., Bikard D., Cox D., Zhang F, Marraffini L A. Nat Biotechnol March; 31(3):233-9 (2013);
-   One-Step Generation of Mice Carrying Mutations in Multiple Genes by CRISPR/Cas-Mediated Genome Engineering. Wang H., Yang H., Shivalila C S., Dawlaty M M., Cheng A W., Zhang F., Jaenisch R. Cell May 9; 153(4):910-8 (2013);
-   Optical control of mammalian endogenous transcription and epigenetic states. Konermann S, Brigham M D, Trevino A E, Hsu P D, Heidenreich M, Cong L, Platt R J, Scott D A, Church G M, Zhang F. Nature. August 22; 500(7463):472-6. doi: 10.1038/Nature12466. Epub 2013 Aug. 23 (2013);
-   Double Nicking by RNA-Guided CRISPR Cas9 for Enhanced Genome Editing Specificity. Ran, F A., Hsu, P D., Lin, C Y., Gootenberg, J S., Konermann, S., Trevino, A E., Scott, D A., Inoue, A., Matoba, S., Zhang, Y., & Zhang, F. Cell August 28. pii: S0092-8674(13)01015-5 (2013-A);
-   DNA targeting specificity of RNA-guided Cas9 nucleases. Hsu, P., Scott, D., Weinstein, J., Ran, F A., Konermann, S., Agarwala, V., Li, Y., Fine, E., Wu, X., Shalem, O., Cradick, T J., Marraffini, L A., Bao, G., & Zhang, F. Nat Biotechnol doi:10.1038/nbt.2647 (2013);
-   Genome engineering using the CRISPR-Cas9 system. Ran, F A., Hsu, P D., Wright, J., Agarwala, V., Scott, D A., Zhang, F. Nature Protocols November; 8(11):2281-308 (2013-B);
-   Genome-Scale CRISPR-Cas9 Knockout Screening in Human Cells. Shalem, O., Sanjana, N E., Hartenian, E., Shi, X., Scott, D A., Mikkelsen, T., Heckl, D., Ebert, B L., Root, D E., Doench, J G., Zhang, F. Science December 12. (2013). [Epub ahead of print];
-   Crystal structure of cas9 in complex with guide RNA and target DNA. Nishimasu, H., Ran, F A., Hsu, P D., Konermann, S., Shehata, S I., Dohmae, N., Ishitani, R., Zhang, F., Nureki, O. Cell February 27, 156(5):935-49 (2014);
-   Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells. Wu X., Scott D A., Kriz A J., Chiu A C., Hsu P D., Dadon D B., Cheng A W., Trevino A E., Konermann S., Chen S., Jaenisch R., Zhang F., Sharp P A. Nat Biotechnol. April 20. doi: 10.1038/nbt.2889 (2014);
-   CRISPR-Cas9 Knockin Mice for Genome Editing and Cancer Modeling. Platt R J, Chen S, Zhou Y, Yim M J, Swiech L, Kempton H R, Dahlman J E, Parnas O, Eisenhaure T M, Jovanovic M, Graham D B, Jhunjhunwala S, Heidenreich M, Xavier R J, Langer R, Anderson D G, Hacohen N, Regev A, Feng G, Sharp P A, Zhang F. Cell 159(2): 440-455 DOI: 10.1016/j.cell.2014.09.014 (2014);
-   Development and Applications of CRISPR-Cas9 for Genome Engineering, Hsu P D, Lander E S, Zhang F., Cell. June 5; 157(6):1262-78 (2014);
-   Genetic screens in human cells using the CRISPR/Cas9 system, Wang T, Wei J J, Sabatini D M, Lander E S., Science. January 3; 343(6166): 80-84. doi:10.1126/science.1246981 (2014);
-   Rational design of highly active sgRNAs for CRISPR-Cas9-mediated gene inactivation, Doench J G, Hartenian E, Graham D B, Tothova Z, Hegde M, Smith I, Sullender M, Ebert B L, Xavier R J, Root D E., (published online 3 Sep. 2014) Nat Biotechnol. December; 32(12):1262-7 (2014);
-   In vivo interrogation of gene function in the mammalian brain using CRISPR-Cas9, Swiech L, Heidenreich M, Banerjee A, Habib N, Li Y, Trombetta J, Sur M, Zhang F., (published online 19 Oct. 2014) Nat Biotechnol. January; 33(1):102-6 (2015);
-   Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex, Konermann S, Brigham M D, Trevino A E, Joung J, Abudayyeh O O, Barcena C, Hsu P D, Habib N, Gootenberg J S, Nishimasu H, Nureki O, Zhang F., Nature. January 29; 517(7536):583-8 (2015);
-   A split-Cas9 architecture for inducible genome editing and transcription modulation, Zetsche B, Volz S E, Zhang F., (published online 2 Feb. 2015) Nat Biotechnol. February; 33(2):139-42 (2015);
-   Genome-wide CRISPR Screen in a Mouse Model of Tumor Growth and Metastasis, Chen S, Sanjana N E, Zheng K, Shalem O, Lee K, Shi X, Scott D A, Song J, Pan J Q, Weissleder R, Lee H, Zhang F, Sharp P A. Cell 160, 1246-1260, Mar. 12, 2015 (multiplex screen in mouse);
-   In vivo genome editing using Staphylococcus aureus Cas9, Ran F A, Cong L, Yan W X, Scott D A, Gootenberg J S, Kriz A J, Zetsche B, Shalem O, Wu X, Makarova K S, Koonin E V, Sharp P A, Zhang F., (published online 1 Apr. 2015), Nature. April 9; 520(7546): 186-91 (2015);
-   Shalem et al., “High-throughput functional genomics using CRISPR-Cas9,” Nature Reviews Genetics 16, 299-311 (May 2015);
-   Xu et al., “Sequence determinants of improved CRISPR sgRNA design,” Genome Research 25, 1147-1157 (August 2015);
-   Parnas et al., “A Genome-wide CRISPR Screen in Primary Immune Cells to Dissect Regulatory Networks,” Cell 162, 675-686 (Jul. 30, 2015);
-   Ramanan et al., “CRISPR/Cas9 cleavage of viral DNA efficiently suppresses hepatitis B virus,” Scientific Reports 5:10833. doi: 10.1038/srep10833 (Jun. 2, 2015);
-   Nishimasu et al., “Crystal Structure of Staphylococcus aureus Cas9,” Cell 162, 1113-1126 (Aug. 27, 2015);
-   Zetsche et al. (2015), “Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system,” Cell 163, 759-771 (Oct. 22, 2015) doi: 10.1016/j.cell.2015.09.038. Epub Sep. 25, 2015;
-   Shmakov et al. (2015), “Discovery and Functional Characterization of Diverse Class 2 CRISPR-Cas Systems,” Molecular Cell 60, 385-397 (Nov. 5, 2015) doi: 10.1016/j.molcel.2015.10.008. Epub Oct. 22, 2015;
-   Dahlman et al., “Orthogonal gene control with a catalytically active Cas9 nuclease,” Nature Biotechnology 33, 1159-1161 (November, 2015);
-   Gao et al, “Engineered Cpf1 Enzymes with Altered PAM Specificities,” bioRxiv 091611; doi: dx.doi.org/10.1101/091611 Epub Dec. 4, 2016; and
-   Smargon et al. (2017), “Cas13b Is a Type VI-B CRISPR-Associated RNA-Guided RNase Differentially Regulated by Accessory Proteins Csx27 and Csx28,” Molecular Cell 65, 618-630 (Feb. 16, 2017) doi: 10.1016/j.molcel.2016.12.023. Epub Jan. 5, 2017;

each of which is incorporated herein by reference, may be considered in the practice of the instant invention, and is discussed briefly below:

-   Cong et al. engineered type II CRISPR-Cas systems for use in eukaryotic cells based on both Streptococcus thermophilus Cas9 and also Streptococcus pyogenes Cas9 and demonstrated that Cas9 nucleases can be directed by short RNAs to induce precise cleavage of DNA in human and mouse cells. Their study further showed that Cas9, as converted into a nicking enzyme, can be used to facilitate homology-directed repair in eukaryotic cells with minimal mutagenic activity. Additionally, their study demonstrated that multiple guide sequences can be encoded into a single CRISPR array to enable simultaneous editing of several sites at endogenous genomic loci within the mammalian genome, demonstrating easy programmability and wide applicability of the RNA-guided nuclease technology. This ability to use RNA to program sequence specific DNA cleavage in cells defined a new class of genome engineering tools. These studies further showed that other CRISPR loci are likely to be transplantable into mammalian cells and can also mediate mammalian genome cleavage. Importantly, it can be envisaged that several aspects of the CRISPR-Cas system can be further improved to increase its efficiency and versatility.
-   Jiang et al. used the clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas9 endonuclease complexed with dual-RNAs to introduce precise mutations in the genomes of Streptococcus pneumoniae and Escherichia coli. The approach relied on dual-RNA:Cas9-directed cleavage at the targeted genomic site to kill unmutated cells and circumvents the need for selectable markers or counter-selection systems. The study reported reprogramming dual-RNA:Cas9 specificity by changing the sequence of short CRISPR RNA (crRNA) to make single- and multinucleotide changes carried on editing templates. The study showed that simultaneous use of two crRNAs enabled multiplex mutagenesis. Furthermore, when the approach was used in combination with recombineering, in S. pneumoniae, nearly 100% of cells that were recovered using the described approach contained the desired mutation, and in E. coli, 65% that were recovered contained the mutation.
-   Wang et al. (2013) used the CRISPR/Cas system for the one-step generation of mice carrying mutations in multiple genes which were traditionally generated in multiple steps by sequential recombination in embryonic stem cells and/or time-consuming intercrossing of mice with a single mutation. The CRISPR/Cas system will greatly accelerate the in vivo study of functionally redundant genes and of epistatic gene interactions.
-   Konermann et al. (2013) addressed the need in the art for versatile and robust technologies that enable optical and chemical modulation of DNA-binding domains based CRISPR Cas9 enzyme and also Transcriptional Activator Like Effectors.
-   Ran et al. (2013-A) described an approach that combined a Cas9 nickase mutant with paired guide RNAs to introduce targeted double-strand breaks. This addresses the issue of the Cas9 nuclease from the microbial CRISPR-Cas system being targeted to specific genomic loci by a guide sequence, which can tolerate certain mismatches to the DNA target and thereby promote undesired off-target mutagenesis. Because individual nicks in the genome are repaired with high fidelity, simultaneous nicking via appropriately offset guide RNAs is required for double-stranded breaks and extends the number of specifically recognized bases for target cleavage. The authors demonstrated that using paired nicking can reduce off-target activity by 50- to 1,500-fold in cell lines and can facilitate gene knockout in mouse zygotes without sacrificing on-target cleavage efficiency. This versatile strategy enables a wide variety of genome editing applications that require high specificity.
-   Hsu et al. (2013) characterized SpCas9 targeting specificity in human cells to inform the selection of target sites and avoid off-target effects. The study evaluated >700 guide RNA variants and SpCas9-induced indel mutation levels at >100 predicted genomic off-target loci in 293T and 293FT cells. The authors showed that SpCas9 tolerates mismatches between guide RNA and target DNA at different positions in a sequence-dependent manner, sensitive to the number, position and distribution of mismatches. The authors further showed that SpCas9-mediated cleavage is unaffected by DNA methylation and that the dosage of SpCas9 and sgRNA can be titrated to minimize off-target modification. Additionally, to facilitate mammalian genome engineering applications, the authors reported providing a web-based software tool to guide the selection and validation of target sequences as well as off-target analyses.
-   Ran et al. (2013-B) described a set of tools for Cas9-mediated genome editing via non-homologous end joining (NHEJ) or homology-directed repair (HDR) in mammalian cells, as well as generation of modified cell lines for downstream functional studies. To minimize off-target cleavage, the authors further described a double-nicking strategy using the Cas9 nickase mutant with paired guide RNAs. The protocol provided by the authors includes experimentally derived guidelines for the selection of target sites, evaluation of cleavage efficiency and analysis of off-target activity. The studies showed that beginning with target design, gene modifications can be achieved within as little as 1-2 weeks, and modified clonal cell lines can be derived within 2-3 weeks.
-   Shalem et al. described a new way to interrogate gene function on a genome-wide scale. Their studies showed that delivery of a genome-scale CRISPR-Cas9 knockout (GeCKO) library targeting 18,080 genes with 64,751 unique guide sequences enabled both negative and positive selection screening in human cells. First, the authors showed use of the GeCKO library to identify genes essential for cell viability in cancer and pluripotent stem cells. Next, in a melanoma model, the authors screened for genes whose loss is involved in resistance to vemurafenib, a therapeutic that inhibits mutant protein kinase BRAF. Their studies showed that the highest-ranking candidates included previously validated genes NF1 and MED12 as well as novel hits NF2, CUL3, TADA2B, and TADA1. The authors observed a high level of consistency between independent guide RNAs targeting the same gene and a high rate of hit confirmation, and thus demonstrated the promise of genome-scale screening with Cas9.
-   Nishimasu et al. reported the crystal structure of Streptococcus pyogenes Cas9 in complex with sgRNA and its target DNA at 2.5 Å resolution. The structure revealed a bilobed architecture composed of target recognition and nuclease lobes, accommodating the sgRNA:DNA heteroduplex in a positively charged groove at their interface. Whereas the recognition lobe is essential for binding sgRNA and DNA, the nuclease lobe contains the HNH and RuvC nuclease domains, which are properly positioned for cleavage of the complementary and non-complementary strands of the target DNA, respectively. The nuclease lobe also contains a carboxyl-terminal domain responsible for the interaction with the protospacer adjacent motif (PAM). This high-resolution structure and accompanying functional analyses have revealed the molecular mechanism of RNA-guided DNA targeting by Cas9, thus paving the way for the rational design of new, versatile genome-editing technologies.
-   Wu et al. mapped genome-wide binding sites of a catalytically inactive Cas9 (dCas9) from Streptococcus pyogenes loaded with single guide RNAs (sgRNAs) in mouse embryonic stem cells (mESCs). The authors showed that each of the four sgRNAs tested targets dCas9 to between tens and thousands of genomic sites, frequently characterized by a 5-nucleotide seed region in the sgRNA and an NGG protospacer adjacent motif (PAM). Chromatin inaccessibility decreases dCas9 binding to other sites with matching seed sequences; thus 70% of off-target sites are associated with genes. The authors showed that targeted sequencing of 295 dCas9 binding sites in mESCs transfected with catalytically active Cas9 identified only one site mutated above background levels. The authors proposed a two-state model for Cas9 binding and cleavage, in which a seed match triggers binding but extensive pairing with target DNA is required for cleavage.
-   Platt et al. established a Cre-dependent Cas9 knockin mouse. The authors demonstrated in vivo as well as ex vivo genome editing using adeno-associated virus (AAV)-, lentivirus-, or particle-mediated delivery of guide RNA in neurons, immune cells, and endothelial cells.
-   Hsu et al. (2014) is a review article that discusses generally CRISPR-Cas9 history from yogurt to genome editing, including genetic screening of cells.
-   Wang et al. (2014) relates to a pooled, loss-of-function genetic screening approach suitable for both positive and negative selection that uses a genome-scale lentiviral single guide RNA (sgRNA) library.
-   Doench et al. created a pool of sgRNAs, tiling across all possible target sites of a panel of six endogenous mouse and three endogenous human genes and quantitatively assessed their ability to produce null alleles of their target gene by antibody staining and flow cytometry. The authors showed that optimization of the PAM improved activity and also provided an on-line tool for designing sgRNAs.
-   Swiech et al. demonstrated that AAV-mediated SpCas9 genome editing can enable reverse genetic studies of gene function in the brain.
-   Konermann et al. (2015) discusses the ability to attach multiple effector domains, e.g., transcriptional activator, functional and epigenomic regulators at appropriate positions on the guide such as stem or tetraloop with and without linkers.
-   Zetsche et al. demonstrated that the Cas9 enzyme can be split into two and hence the assembly of Cas9 for activation can be controlled.
-   Chen et al. relates to multiplex screening by demonstrating that a genome-wide in vivo CRISPR-Cas9 screen in mice reveals genes regulating lung metastasis.
-   Ran et al. (2015) relates to SaCas9 and its ability to edit genomes and demonstrates that one cannot extrapolate from biochemical assays.
-   Shalem et al. (2015) described ways in which catalytically inactive Cas9 (dCas9) fusions are used to synthetically repress (CRISPRi) or activate (CRISPRa) expression, showing advances using Cas9 for genome-scale screens, including arrayed and pooled screens, knockout approaches that inactivate genomic loci and strategies that modulate transcriptional activity.
-   Xu et al. (2015) assessed the DNA sequence features that contribute to single guide RNA (sgRNA) efficiency in CRISPR-based screens. The authors explored efficiency of CRISPR/Cas9 knockout and nucleotide preference at the cleavage site. The authors also found that the sequence preference for CRISPRi/a is substantially different from that for CRISPR/Cas9 knockout.
-   Parnas et al. (2015) introduced genome-wide pooled CRISPR-Cas9 libraries into dendritic cells (DCs) to identify genes that control the induction of tumor necrosis factor (Tnf) by bacterial lipopolysaccharide (LPS). Known regulators of Tlr4 signaling and previously unknown candidates were identified and classified into three functional modules with distinct effects on the canonical responses to LPS.
-   Ramanan et al. (2015) demonstrated cleavage of viral episomal DNA (cccDNA) in infected cells. The HBV genome exists in the nuclei of infected hepatocytes as a 3.2 kb double-stranded episomal DNA species called covalently closed circular DNA (cccDNA), which is a key component in the HBV life cycle whose replication is not inhibited by current therapies. The authors showed that sgRNAs specifically targeting highly conserved regions of HBV robustly suppressed viral replication and depleted cccDNA.
-   Nishimasu et al. (2015) reported the crystal structures of SaCas9 in complex with a single guide RNA (sgRNA) and its double-stranded DNA targets, containing the 5′-TTGAAT-3′ PAM and the 5′-TTGGGT-3′ PAM. A structural comparison of SaCas9 with SpCas9 highlighted both structural conservation and divergence, explaining their distinct PAM specificities and orthologous sgRNA recognition.

Also, “Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing”, Shengdar Q. Tsai, Nicolas Wyvekens, Cyd Khayter, Jennifer A. Foden, Vishal Thapar, Deepak Reyon, Mathew J. Goodwin, Martin J. Aryee, J. Keith Joung Nature Biotechnology 32(6): 569-77 (2014), relates to dimeric RNA-guided FokI Nucleases that recognize extended sequences and can edit endogenous genes with high efficiencies in human cells. In addition, mention is made of PCT application PCT/US14/70057, Attorney Reference 47627.99.2060 and BI-2013/107 entitled “DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS FOR TARGETING DISORDERS AND DISEASES USING PARTICLE DELIVERY COMPONENTS” (claiming priority from one or more or all of US provisional patent applications: 62/054,490, filed Sep. 24, 2014; 62/010,441, filed Jun. 10, 2014; and 61/915,118, 61/915,215 and 61/915,148, each filed on Dec. 12, 2013) (“the Particle Delivery PCT”), incorporated herein by reference, with respect to a method of preparing an sgRNA-and-Cas9 protein containing particle comprising admixing a mixture comprising an sgRNA and Cas9 protein (and optionally HDR template) with a mixture comprising or consisting essentially of or consisting of surfactant, phospholipid, biodegradable polymer, lipoprotein and alcohol; and particles from such a process. For example, wherein Cas9 protein and sgRNA were mixed together at a suitable, e.g., 3:1 to 1:3 or 2:1 to 1:2 or 1:1 molar ratio, at a suitable temperature, e.g., 15-30° C., e.g., 20-25° C., e.g., room temperature, for a suitable time, e.g., 15-45, such as 30 minutes, advantageously in sterile, nuclease free buffer, e.g., 1×PBS. Separately, particle components such as or comprising: a surfactant, e.g., cationic lipid, e.g., 1,2-dioleoyl-3-trimethylammonium-propane (DOTAP); phospholipid, e.g., dimyristoylphosphatidylcholine (DMPC); biodegradable polymer, such as an ethylene-glycol polymer or PEG, and a lipoprotein, such as a low-density lipoprotein, e.g., cholesterol were dissolved in an alcohol, advantageously a C₁₋₆ alkyl alcohol, such as methanol, ethanol, isopropanol, e.g., 100% ethanol. The two solutions were mixed together to form particles containing the Cas9-sgRNA complexes. Accordingly, sgRNA may be pre-complexed with the Cas9 protein, before formulating the entire complex in a particle. Formulations may be made with a different molar ratio of different components known to promote delivery of nucleic acids into cells (e.g. 1,2-dioleoyl-3-trimethylammonium-propane (DOTAP), 1,2-ditetradecanoyl-sn-glycero-3-phosphocholine (DMPC), polyethylene glycol (PEG), and cholesterol). For example, DOTAP:DMPC:PEG:Cholesterol Molar Ratios may be DOTAP 100, DMPC 0, PEG 0, Cholesterol 0; or DOTAP 90, DMPC 0, PEG 10, Cholesterol 0; or DOTAP 90, DMPC 0, PEG 5, Cholesterol 5. That application accordingly comprehends admixing sgRNA, Cas9 protein and components that form a particle; as well as particles from such admixing. Aspects of the instant invention can involve particles; for example, particles using a process analogous to that of the Particle Delivery PCT, e.g., by admixing a mixture comprising crRNA and/or CRISPR-Cas as in the instant invention and components that form a particle, e.g., as in the Particle Delivery PCT, to form a particle and particles from such admixing (or, of course, other particles involving crRNA and/or CRISPR-Cas as in the instant invention).
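By way of illustration only, the DOTAP:DMPC:PEG:Cholesterol molar ratios recited above can be converted into per-component amounts for a chosen total quantity of particle components. The following Python sketch is not taken from the Particle Delivery PCT; the formulation labels "A" to "C", the function name `component_amounts`, and the 200 nmol total are hypothetical values assumed for the example.

```python
from typing import Dict

# Molar ratios as recited in the text (parts per 100).
# The labels "A", "B" and "C" are hypothetical and used only for this example.
FORMULATIONS: Dict[str, Dict[str, float]] = {
    "A": {"DOTAP": 100, "DMPC": 0, "PEG": 0, "Cholesterol": 0},
    "B": {"DOTAP": 90, "DMPC": 0, "PEG": 10, "Cholesterol": 0},
    "C": {"DOTAP": 90, "DMPC": 0, "PEG": 5, "Cholesterol": 5},
}


def component_amounts(ratio: Dict[str, float], total_nmol: float) -> Dict[str, float]:
    """Split a total molar amount across components according to a molar ratio."""
    parts_total = sum(ratio.values())
    if parts_total <= 0:
        raise ValueError("molar ratio must contain at least one positive part")
    if total_nmol < 0:
        raise ValueError("total amount must be non-negative")
    return {name: parts * total_nmol / parts_total for name, parts in ratio.items()}


if __name__ == "__main__":
    # 200 nmol is an assumed total, chosen only to make the arithmetic visible.
    for label, ratio in FORMULATIONS.items():
        print(label, component_amounts(ratio, total_nmol=200.0))
```

The calculation is plain proportional scaling, so it applies equally to any other molar ratio of the same components.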

Guide Sequences

In embodiments of the invention the terms guide sequence and guide RNA and crRNA are used interchangeably as in foregoing cited documents such as WO 2014/093622 (PCT/US2013/074667). In general, a guide sequence is any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of a CRISPR complex to the target sequence. In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies; available at www.novocraft.com), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net). In some embodiments, a guide sequence is about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some embodiments, a guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length. Preferably the guide sequence is 10-30 nucleotides long, such as 30 nucleotides long. The ability of a guide sequence to direct sequence-specific binding of a CRISPR complex to a target sequence may be assessed by any suitable assay. For example, the components of a CRISPR system sufficient to form a CRISPR complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target sequence, such as by transfection with vectors encoding the components of the CRISPR sequence, followed by an assessment of preferential cleavage within the target sequence, such as by Surveyor assay as described herein. Similarly, cleavage of a target polynucleotide sequence may be evaluated in a test tube by providing the target sequence, components of a CRISPR complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions. Other assays are possible, and will occur to those skilled in the art. A guide sequence may be selected to target any target sequence. In some embodiments, the target sequence is a sequence within a genome of a cell. Exemplary target sequences include those that are unique in the target genome.
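As a non-limiting illustration of the degree of complementarity discussed above, the following Python sketch computes the percentage of guide positions that are Watson-Crick complementary to an RNA target site of equal length. It is a simplified, ungapped comparison and is not one of the alignment algorithms named above (e.g., Smith-Waterman); G-U wobble pairing is not counted, and the function names and example sequences are hypothetical.

```python
# Watson-Crick partners for RNA bases. G-U wobble pairs are deliberately
# not counted as complementary in this simplified sketch.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def percent_complementarity(guide: str, target_site: str) -> float:
    """Percent of guide positions that pair with an equal-length target site.

    Both sequences are written 5'->3'. The comparison is ungapped, which is
    an assumption made for brevity rather than an optimal alignment.
    """
    if not guide:
        raise ValueError("guide must be non-empty")
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must have equal length")
    # A perfectly complementary guide equals the target's reverse complement.
    ideal_guide = reverse_complement(target_site)
    matches = sum(g == i for g, i in zip(guide.upper(), ideal_guide))
    return 100.0 * matches / len(guide)


if __name__ == "__main__":
    target = "GGAUCCAUGC"      # hypothetical 10-nt target site
    perfect = "GCAUGGAUCC"     # reverse complement of the target
    mismatched = "GCAUGGAUCA"  # one mismatch at the 3' end of the guide
    print(percent_complementarity(perfect, target))
    print(percent_complementarity(mismatched, target))
```

A guide scored this way could then be compared against a threshold such as 90% or 95%; where gaps or unequal lengths must be handled, one of the alignment algorithms named above would be substituted for the ungapped comparison.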

In general, and throughout this specification, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. One type of vector is a “plasmid,” which refers to a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques. Another type of vector is a viral vector, wherein virally-derived DNA or RNA sequences are present in the vector for packaging into a virus (e.g., retroviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, and adeno-associated viruses). Viral vectors also include polynucleotides carried by a virus for transfection into a host cell. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively-linked. Such vectors are referred to herein as “expression vectors.” Vectors for and that result in expression in a eukaryotic cell can be referred to herein as “eukaryotic expression vectors.” Common expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.

Recombinant expression vectors can comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory elements, which may be selected on the basis of the host cells to be used for expression, that are operatively-linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, “operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory element(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell).

The term “regulatory element” is intended to include promoters, enhancers, internal ribosomal entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences). Such regulatory elements are described, for example, in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990). Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). A tissue-specific promoter may direct expression primarily in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, specific organs (e.g., liver, pancreas), or particular cell types (e.g., lymphocytes). Regulatory elements may also direct expression in a temporal-dependent manner, such as in a cell-cycle dependent or developmental stage-dependent manner, which may or may not also be tissue or cell-type specific. In some embodiments, a vector comprises one or more pol III promoters (e.g., 1, 2, 3, 4, 5, or more pol III promoters), one or more pol II promoters (e.g., 1, 2, 3, 4, 5, or more pol II promoters), one or more pol I promoters (e.g., 1, 2, 3, 4, 5, or more pol I promoters), or combinations thereof. Examples of pol III promoters include, but are not limited to, U6 and H1 promoters. Examples of pol II promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer) [see, e.g., Boshart et al., Cell, 41:521-530 (1985)], the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerate kinase (PGK) promoter, and the EF1α promoter. Also encompassed by the term “regulatory element” are enhancer elements, such as WPRE; CMV enhancers; the R-U5′ segment in LTR of HTLV-I (Mol. Cell. Biol., Vol. 8(1), p. 466-472, 1988); SV40 enhancer; and the intron sequence between exons 2 and 3 of rabbit β-globin (Proc. Natl. Acad. Sci. USA., Vol. 78(3), p. 1527-31, 1981). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression desired, etc. A vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., clustered regularly interspaced short palindromic repeats (CRISPR) transcripts, proteins, enzymes, mutant forms thereof, fusion proteins thereof, etc.).

Advantageous vectors include lentiviruses and adeno-associated viruses, and types of such vectors can also be selected for targeting particular types of cells.

As used herein, the term “crRNA” or “guide RNA” or “single guide RNA” or “sgRNA” or “one or more nucleic acid components” of a Type VI CRISPR-Cas locus effector protein comprises any polynucleotide sequence having sufficient complementarity with a target nucleic acid sequence to hybridize with the target nucleic acid sequence and direct sequence-specific binding of an RNA-targeting complex to the target RNA sequence.

In certain embodiments, the CRISPR system as provided herein can make use of a crRNA or analogous polynucleotide comprising a guide sequence, wherein the polynucleotide is an RNA, a DNA or a mixture of RNA and DNA, and/or wherein the polynucleotide comprises one or more nucleotide analogs. The sequence can comprise any structure, including but not limited to a structure of a native crRNA, such as a bulge, a hairpin or a stem loop structure. In certain embodiments, the polynucleotide comprising the guide sequence forms a duplex with a second polynucleotide sequence which can be an RNA or a DNA sequence.

In certain embodiments, guides of the invention comprise non-naturally occurring nucleic acids and/or non-naturally occurring nucleotides and/or nucleotide analogs, and/or chemical modifications. Non-naturally occurring nucleic acids can include, for example, mixtures of naturally and non-naturally occurring nucleotides. Non-naturally occurring nucleotides and/or nucleotide analogs may be modified at the ribose, phosphate, and/or base moiety. In an embodiment of the invention, a guide nucleic acid comprises ribonucleotides and non-ribonucleotides. In one such embodiment, a guide comprises one or more ribonucleotides and one or more deoxyribonucleotides. In an embodiment of the invention, the guide comprises one or more non-naturally occurring nucleotides or nucleotide analogs, such as a nucleotide with a phosphorothioate linkage or a boranophosphate linkage, locked nucleic acid (LNA) nucleotides comprising a methylene bridge between the 2′ and 4′ carbons of the ribose ring, or bridged nucleic acids (BNA). Other examples of modified nucleotides include 2′-O-methyl analogs, 2′-deoxy analogs, 2-thiouridine analogs, N6-methyladenosine analogs, or 2′-fluoro analogs. Further examples of modified bases include, but are not limited to, 2-aminopurine, 5-bromo-uridine, pseudouridine (Ψ), N1-methylpseudouridine (me1Ψ), 5-methoxyuridine (5moU), inosine, and 7-methylguanosine. Examples of guide RNA chemical modifications include, without limitation, incorporation of 2′-O-methyl (M), 2′-O-methyl 3′phosphorothioate (MS), S-constrained ethyl (cEt), or 2′-O-methyl 3′thioPACE (MSP) at one or more terminal nucleotides. Such chemically modified guide RNAs can comprise increased stability and increased activity as compared to unmodified guide RNAs, though on-target vs. off-target specificity is not predictable. (See Hendel, 2015, Nat Biotechnol. 33(9):985-9, doi: 10.1038/nbt.3290, published online 29 Jun. 2015; Allerson et al., J. Med. Chem. 2005, 48:901-904; Bramsen et al., Front. Genet., 2012, 3:154; Deng et al., PNAS, 2015, 112:11870-11875; Sharma et al., MedChemComm., 2014, 5:1454-1471; Li et al., Nature Biomedical Engineering, 2017, 1, 0066, DOI: 10.1038/s41551-017-0066).

In some embodiments, the 5′ and/or 3′ end of a guide RNA is modified by a variety of functional moieties including fluorescent dyes, polyethylene glycol, cholesterol, proteins, or detection tags. (See Kelly et al., 2016, J. Biotech. 233:74-83). In certain embodiments, a guide comprises ribonucleotides in a region that binds to a target DNA and one or more deoxyribonucleotides and/or nucleotide analogs in a region that binds to Cas9, Cpf1, or C2c1. In an embodiment of the invention, deoxyribonucleotides and/or nucleotide analogs are incorporated in engineered guide structures, such as, without limitation, 5′ and/or 3′ end, stem-loop regions, and the seed region. In certain embodiments, the modification is not in the 5′-handle of the stem-loop regions. Chemical modification in the 5′-handle of the stem-loop region of a guide may abolish its function (see Li, et al., Nature Biomedical Engineering, 2017, 1:0066). In certain embodiments, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides of a guide are chemically modified. In some embodiments, 3-5 nucleotides at either the 3′ or the 5′ end of a guide are chemically modified. In some embodiments, only minor modifications are introduced in the seed region, such as 2′-F modifications. In some embodiments, 2′-F modification is introduced at the 3′ end of a guide. In certain embodiments, three to five nucleotides at the 5′ and/or the 3′ end of the guide are chemically modified with 2′-O-methyl (M), 2′-O-methyl-3′-phosphorothioate (MS), S-constrained ethyl (cEt), or 2′-O-methyl-3′-thioPACE (MSP). Such modification can enhance genome editing efficiency (see Hendel et al., Nat. Biotechnol. (2015) 33(9): 985-989). In certain embodiments, all of the phosphodiester bonds of a guide are substituted with phosphorothioates (PS) for enhancing levels of gene disruption. In certain embodiments, more than five nucleotides at the 5′ and/or the 3′ end of the guide are chemically modified with 2′-O-Me, 2′-F or S-constrained ethyl (cEt). Such chemically modified guide can mediate enhanced levels of gene disruption (see Rahdar et al., 2015, PNAS, E7110-E7111). In an embodiment of the invention, a guide is modified to comprise a chemical moiety at its 3′ and/or 5′ end. Such moieties include, but are not limited to, amine, azide, alkyne, thio, dibenzocyclooctyne (DBCO), or Rhodamine. In certain embodiments, the chemical moiety is conjugated to the guide by a linker, such as an alkyl chain. In certain embodiments, the chemical moiety of the modified guide can be used to attach the guide to another molecule, such as DNA, RNA, protein, or nanoparticles. Such chemically modified guide can be used to identify or enrich cells genetically edited by a CRISPR system (see Lee et al., eLife, 2017, 6:e25312, DOI:10.7554).

In some embodiments, the modification to the guide is a chemical modification, an insertion, a deletion or a split. In some embodiments, the chemical modification includes, but is not limited to, incorporation of 2′-O-methyl (M) analogs, 2′-deoxy analogs, 2-thiouridine analogs, N6-methyladenosine analogs, 2′-fluoro analogs, 2-aminopurine, 5-bromo-uridine, pseudouridine (Ψ), N1-methylpseudouridine (me1Ψ), 5-methoxyuridine (5moU), inosine, 7-methylguanosine, 2′-O-methyl-3′-phosphorothioate (MS), S-constrained ethyl (cEt), phosphorothioate (PS), or 2′-O-methyl-3′-thioPACE (MSP). In some embodiments, the guide comprises one or more phosphorothioate modifications. In certain embodiments, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 25 nucleotides of the guide are chemically modified. In certain embodiments, one or more nucleotides in the seed region are chemically modified. In certain embodiments, one or more nucleotides in the 3′-terminus are chemically modified. In certain embodiments, none of the nucleotides in the 5′-handle is chemically modified. In some embodiments, the chemical modification in the seed region is a minor modification, such as incorporation of a 2′-fluoro analog. In a specific embodiment, one nucleotide of the seed region is replaced with a 2′-fluoro analog. In some embodiments, 5 or 10 nucleotides in the 3′-terminus are chemically modified. Such chemical modifications at the 3′-terminus of the Cpf1 crRNA improve gene cutting efficiency (see Li, et al., Nature Biomedical Engineering, 2017, 1:0066). In a specific embodiment, 5 nucleotides in the 3′-terminus are replaced with 2′-fluoro analogues. In a specific embodiment, 10 nucleotides in the 3′-terminus are replaced with 2′-fluoro analogues. In a specific embodiment, 5 nucleotides in the 3′-terminus are replaced with 2′-O-methyl (M) analogs.

In some embodiments, the loop of the 5′-handle of the guide is modified. In some embodiments, the loop of the 5′-handle of the guide is modified to have a deletion, an insertion, a split, or chemical modifications. In certain embodiments, the loop comprises 3, 4, or 5 nucleotides. In certain embodiments, the loop comprises the sequence of UCUU, UUUU, UAUU, or UGUU.

In one aspect, the guide comprises portions that are chemically linked or conjugated via a non-phosphodiester bond. In one aspect, the guide comprises, in non-limiting examples, a direct repeat sequence portion and a targeting sequence portion that are chemically linked or conjugated via a non-nucleotide loop. In some embodiments, the portions are joined via a non-phosphodiester covalent linker. Examples of the covalent linker include but are not limited to a chemical moiety selected from the group consisting of carbamates, ethers, esters, amides, imines, amidines, aminotriazines, hydrazone, disulfides, thioethers, thioesters, phosphorothioates, phosphorodithioates, sulfonamides, sulfonates, sulfones, sulfoxides, ureas, thioureas, hydrazide, oxime, triazole, photolabile linkages, C-C bond forming groups such as Diels-Alder cyclo-addition pairs or ring-closing metathesis pairs, and Michael reaction pairs.

In some embodiments, portions of the guide are first synthesized using the standard phosphoramidite synthetic protocol (Herdewijn, P., ed., Methods in Molecular Biology Vol. 288, Oligonucleotide Synthesis: Methods and Applications, Humana Press, New Jersey (2012)). In some embodiments, the non-targeting guide portions can be functionalized to contain an appropriate functional group for ligation using the standard protocol known in the art (Hermanson, G. T., Bioconjugate Techniques, Academic Press (2013)). Examples of functional groups include, but are not limited to, hydroxyl, amine, carboxylic acid, carboxylic acid halide, carboxylic acid active ester, aldehyde, carbonyl, chlorocarbonyl, imidazolylcarbonyl, hydrazide, semicarbazide, thiosemicarbazide, thiol, maleimide, haloalkyl, sulfonyl, allyl, propargyl, diene, alkyne, and azide. Once a non-targeting portion of a guide is functionalized, a covalent chemical bond or linkage can be formed between the two oligonucleotides. Examples of chemical bonds include, but are not limited to, those based on carbamates, ethers, esters, amides, imines, amidines, aminotriazines, hydrazone, disulfides, thioethers, thioesters, phosphorothioates, phosphorodithioates, sulfonamides, sulfonates, sulfones, sulfoxides, ureas, thioureas, hydrazide, oxime, triazole, photolabile linkages, C-C bond forming groups such as Diels-Alder cyclo-addition pairs or ring-closing metathesis pairs, and Michael reaction pairs.

In some embodiments, one or more portions of a guide can be chemically synthesized. In some embodiments, the chemical synthesis uses automated, solid-phase oligonucleotide synthesis machines with 2′-acetoxyethyl orthoester (2′-ACE) (Scaringe et al., J. Am. Chem. Soc. (1998) 120: 11820-11821; Scaringe, Methods Enzymol. (2000) 317: 3-18) or 2′-thionocarbamate (2′-TC) chemistry (Dellinger et al., J. Am. Chem. Soc. (2011) 133: 11540-11546; Hendel et al., Nat. Biotechnol. (2015) 33:985-989).

In some embodiments, the guide portions can be covalently linked using various bioconjugation reactions, loops, bridges, and non-nucleotide links via modifications of sugar, internucleotide phosphodiester bonds, purine and pyrimidine residues (Sletten et al., Angew. Chem. Int. Ed. (2009) 48:6974-6998; Manoharan, M. Curr. Opin. Chem. Biol. (2004) 8: 570-9; Behlke et al., Oligonucleotides (2008) 18: 305-19; Watts, et al., Drug. Discov. Today (2008) 13: 842-55; Shukla, et al., ChemMedChem (2010) 5: 328-49).

In some embodiments, the guide portions can be covalently linked using click chemistry. In some embodiments, guide portions can be covalently linked using a triazole linker. In some embodiments, guide portions can be covalently linked using a Huisgen 1,3-dipolar cycloaddition reaction involving an alkyne and azide to yield a highly stable triazole linker (He et al., ChemBioChem (2015) 17: 1809-1812; WO 2016/186745). In some embodiments, guide portions are covalently linked by ligating a 5′-hexyne portion and a 3′-azide portion. In some embodiments, either or both of the 5′-hexyne guide portion and the 3′-azide guide portion can be protected with a 2′-acetoxyethyl orthoester (2′-ACE) group, which can be subsequently removed using the Dharmacon protocol (Scaringe et al., J. Am. Chem. Soc. (1998) 120: 11820-11821; Scaringe, Methods Enzymol. (2000) 317: 3-18).

In some embodiments, guide portions can be covalently linked via a linker (e.g., a non-nucleotide loop) that comprises a moiety such as spacers, attachments, bioconjugates, chromophores, reporter groups, dye labeled RNAs, and non-naturally occurring nucleotide analogues. More specifically, suitable spacers for purposes of this invention include, but are not limited to, polyethers (e.g., polyethylene glycols, polyalcohols, polypropylene glycol or mixtures of ethylene and propylene glycols), polyamines (e.g., spermine, spermidine and polymeric derivatives thereof), polyesters (e.g., poly(ethyl acrylate)), polyphosphodiesters, alkylenes, and combinations thereof. Suitable attachments include any moiety that can be added to the linker to add additional properties to the linker, such as but not limited to, fluorescent labels. Suitable bioconjugates include, but are not limited to, peptides, glycosides, lipids, cholesterol, phospholipids, diacyl glycerols and dialkyl glycerols, fatty acids, hydrocarbons, enzyme substrates, steroids, biotin, digoxigenin, carbohydrates, and polysaccharides. Suitable chromophores, reporter groups, and dye-labeled RNAs include, but are not limited to, fluorescent dyes such as fluorescein and rhodamine, chemiluminescent, electrochemiluminescent, and bioluminescent marker compounds. The design of example linkers conjugating two RNA components is also described in WO 2004/015075.

The linker (e.g., a non-nucleotide loop) can be of any length. In some embodiments, the linker has a length equivalent to about 0-16 nucleotides. In some embodiments, the linker has a length equivalent to about 0-8 nucleotides. In some embodiments, the linker has a length equivalent to about 0-4 nucleotides. In some embodiments, the linker has a length equivalent to about 2 nucleotides. Example linker design is also described in WO2011/008730.

In some embodiments, the degree of complementarity, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies; available at www.novocraft.com), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net). The ability of a guide sequence (within an RNA-targeting guide RNA or crRNA) to direct sequence-specific binding of a nucleic acid-targeting complex to a target nucleic acid sequence may be assessed by any suitable assay. For example, the components of an RNA-targeting CRISPR-Cas system sufficient to form a nucleic acid-targeting complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target nucleic acid sequence, such as by transfection with vectors encoding the components of the nucleic acid-targeting complex, followed by an assessment of preferential targeting (e.g., cleavage) within the target nucleic acid sequence, such as by Surveyor assay as described herein. Similarly, cleavage of a target nucleic acid sequence may be evaluated in a test tube by providing the target nucleic acid sequence, components of a nucleic acid-targeting complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions. Other assays are possible, and will occur to those skilled in the art. A guide sequence, and hence an RNA-targeting guide RNA or crRNA, may be selected to target any target nucleic acid sequence. The target sequence may be DNA. The target sequence may be any RNA sequence. In some embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of messenger RNA (mRNA), pre-mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), micro-RNA (miRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), double stranded RNA (dsRNA), non-coding RNA (ncRNA), long non-coding RNA (lncRNA), and small cytoplasmatic RNA (scRNA). In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of mRNA, pre-mRNA, and rRNA. In some preferred embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of ncRNA and lncRNA. In some more preferred embodiments, the target sequence may be a sequence within an mRNA molecule or a pre-mRNA molecule.
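By way of illustration only, the degree of complementarity between a guide sequence and a target RNA site can be approximated with a short Python sketch. The sketch below is a simplification and not one of the alignment algorithms named above: it performs an ungapped, position-by-position comparison, ignores G-U wobble pairing, and uses a hypothetical function name and hypothetical example sequences.

```python
# Watson-Crick partners for RNA; G-U wobble pairs are deliberately ignored
# (a simplifying assumption of this sketch).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide: str, target_site: str) -> float:
    """Return the percentage of guide positions that pair with the target site.

    Both sequences are written 5'->3'. The guide binds antiparallel to the
    target, so each guide position is compared against the reversed target.
    Only an ungapped comparison of equal-length sequences is supported.
    """
    guide = guide.upper()
    target_site = target_site.upper()
    if not guide:
        raise ValueError("sequences must not be empty")
    if len(guide) != len(target_site):
        raise ValueError("this sketch expects equal-length sequences")
    paired = sum(
        1
        for g, t in zip(guide, reversed(target_site))
        if RNA_COMPLEMENT.get(g) == t
    )
    return 100.0 * paired / len(guide)


# Hypothetical 8-nt example in which the guide is the exact reverse
# complement of the target site.
target = "AUGGCUAC"
guide = "GUAGCCAU"
print(percent_complementarity(guide, target))
```

A gapped local or global alignment, as produced by the algorithms listed above, may score a given pair differently from this ungapped comparison.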

In some embodiments, an RNA-targeting guide RNA or crRNA is selected to reduce the degree of secondary structure within the RNA-targeting guide RNA or crRNA. In some embodiments, about or less than about 75%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, or fewer of the nucleotides of the RNA-targeting guide RNA participate in self-complementary base pairing when optimally folded. Optimal folding may be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimal Gibbs free energy. An example of one such algorithm is mFold, as described by Zuker and Stiegler (Nucleic Acids Res. 9 (1981), 133-148). Another example folding algorithm is the online webserver RNAfold, developed at the Institute for Theoretical Chemistry at the University of Vienna, using the centroid structure prediction algorithm (see e.g., A. R. Gruber et al., 2008, Cell 106(1): 23-24; and P A Carr and G M Church, 2009, Nature Biotechnology 27(12): 1151-62).
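As a minimal sketch of how the fraction of self-paired nucleotides might be computed, the Python example below assumes that a folding program has already produced a structure in dot-bracket notation, in which "(" and ")" mark paired positions and "." marks unpaired positions. The sketch does not fold RNA itself, and the function name and example structure are hypothetical.

```python
def paired_fraction(dot_bracket: str) -> float:
    """Fraction of nucleotides that are base-paired in a dot-bracket string."""
    if not dot_bracket:
        raise ValueError("structure must not be empty")
    depth = 0
    for symbol in dot_bracket:
        if symbol == "(":
            depth += 1
        elif symbol == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("closing bracket without a partner")
        elif symbol != ".":
            raise ValueError(f"unexpected symbol: {symbol!r}")
    if depth != 0:
        raise ValueError("opening bracket without a partner")
    paired = dot_bracket.count("(") + dot_bracket.count(")")
    return paired / len(dot_bracket)


# Hypothetical 20-nt guide with one 4-bp hairpin: 8 of 20 positions paired.
structure = "((((....))))........"
print(f"{paired_fraction(structure):.0%} of nucleotides are paired")
```

The resulting fraction can then be compared against whichever of the percentage thresholds recited above is being applied.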

In some embodiments, a nucleic acid-targeting guide is designed or selected to modulate intermolecular interactions among guide molecules, such as among stem-loop regions of different guide molecules. It will be appreciated that nucleotides within a guide that base-pair to form a stem-loop are also capable of base-pairing to form an intermolecular duplex with a second guide and that such an intermolecular duplex would not have a secondary structure compatible with CRISPR complex formation. Accordingly, it is useful to select or design DR sequences in order to modulate stem-loop formation and CRISPR complex formation. In some embodiments, about or less than about 75%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, or fewer of nucleic acid-targeting guides are in intermolecular duplexes. It will be appreciated that stem-loop variation will often be within limits imposed by DR-CRISPR effector interactions. One way to modulate stem-loop formation or change the equilibrium between stem-loop and intermolecular duplex is to vary nucleotide pairs in the stem of the stem-loop of a DR. For example, in one embodiment, a G-C pair is replaced by an A-U or U-A pair. In another embodiment, an A-U pair is substituted for a G-C or a C-G pair. In another embodiment, a naturally occurring nucleotide is replaced by a nucleotide analog. Another way to modulate stem-loop formation or change the equilibrium between stem-loop and intermolecular duplex is to modify the loop of the stem-loop of a DR. Without being bound by theory, the loop can be viewed as an intervening sequence flanked by two sequences that are complementary to each other. When that intervening sequence is not self-complementary, its effect will be to destabilize intermolecular duplex formation. The same principle applies when guides are multiplexed: while the targeting sequences may differ, it may be advantageous to modify the stem-loop region in the DRs of the different guides. Moreover, when guides are multiplexed, the relative activities of the different guides can be modulated by balancing the activity of each individual guide. In certain embodiments, the equilibrium between intramolecular stem-loops vs. intermolecular duplexes is determined. The determination may be made by physical or biochemical means and can be in the presence or absence of a CRISPR effector.

In certain embodiments, a guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat (DR) sequence and a guide sequence or spacer sequence. In certain embodiments, the guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat sequence fused or linked to a guide sequence or spacer sequence. In certain embodiments, the direct repeat sequence may be located upstream (i.e., 5′) from the guide sequence or spacer sequence. In other embodiments, the direct repeat sequence may be located downstream (i.e., 3′) from the guide sequence or spacer sequence. In other embodiments, multiple DRs (such as dual DRs) may be present.

In certain embodiments, the crRNA comprises a stem loop, preferably a single stem loop. In certain embodiments, the direct repeat sequence forms a stem loop, preferably a single stem loop.

In certain embodiments, the spacer length of the guide RNA is from 15 to 35 nt. In certain embodiments, the spacer length of the guide RNA is at least 15 nucleotides. In certain embodiments, the spacer length is from 15 to 17 nt, e.g., 15, 16, or 17 nt, from 17 to 20 nt, e.g., 17, 18, 19, or 20 nt, from 20 to 24 nt, e.g., 20, 21, 22, 23, or 24 nt, from 23 to 25 nt, e.g., 23, 24, or 25 nt, from 24 to 27 nt, e.g., 24, 25, 26, or 27 nt, from 27 to 30 nt, e.g., 27, 28, 29, or 30 nt, from 30 to 35 nt, e.g., 30, 31, 32, 33, 34, or 35 nt, or 35 nt or longer.
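The arrangement of direct repeat and spacer described above, together with the 15 to 35 nt spacer length of one embodiment, can be sketched in Python as follows. This is an illustrative sketch only: the function name, its parameters, and the placeholder direct repeat sequence are hypothetical and do not correspond to any sequence of the disclosure, and the length bounds are adjustable because other embodiments permit spacers of 35 nt or longer.

```python
def assemble_crrna(
    direct_repeat: str,
    spacer: str,
    dr_position: str = "5prime",
    min_spacer: int = 15,
    max_spacer: int = 35,
) -> str:
    """Join a direct repeat (DR) and a spacer into a single crRNA string.

    dr_position="5prime" places the DR upstream of the spacer;
    dr_position="3prime" places it downstream. The default length bounds
    reflect the 15 to 35 nt embodiment and can be changed by the caller.
    """
    if not min_spacer <= len(spacer) <= max_spacer:
        raise ValueError(
            f"spacer length {len(spacer)} nt is outside "
            f"{min_spacer}-{max_spacer} nt"
        )
    if dr_position == "5prime":
        return direct_repeat + spacer
    if dr_position == "3prime":
        return spacer + direct_repeat
    raise ValueError("dr_position must be '5prime' or '3prime'")


# Placeholder sequences for illustration only (not real DR or spacer sequences).
placeholder_dr = "GGCCAAAAGGCC"
placeholder_spacer = "ACGU" * 5  # 20 nt

print(assemble_crrna(placeholder_dr, placeholder_spacer, dr_position="3prime"))
```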

The “tracrRNA” sequence or analogous terms includes any polynucleotide sequence that has sufficient complementarity with a crRNA sequence to hybridize. In general, degree of complementarity is with reference to the optimal alignment of the sca sequence and tracr sequence, along the length of the shorter of the two sequences. Optimal alignment may be determined by any suitable alignment algorithm, and may further account for secondary structures, such as self-complementarity within either the sca sequence or tracr sequence. In some embodiments, the degree of complementarity between the tracr sequence and sca sequence along the length of the shorter of the two when optimally aligned is about or more than about 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97.5%, 99%, or higher. In certain embodiments, the tracrRNA may not be required. Indeed, the CRISPR-Cas effector protein from Bergeyella zoohelcum and orthologs thereof do not require a tracrRNA to ensure cleavage of an RNA target.

In further detail, the assay is as follows for an RNA target, provided that a PAM sequence is required to direct recognition. Two E. coli strains are used in this assay. One carries a plasmid that encodes the endogenous effector protein locus from the bacterial strain. The other strain carries an empty plasmid (e.g. pACYC184, control strain). All possible 7 or 8 bp PAM sequences are presented on an antibiotic resistance plasmid (pUC19 with ampicillin resistance gene). The PAM is located next to the sequence of proto-spacer 1 (the RNA target to the first spacer in the endogenous effector protein locus). Two PAM libraries were cloned. One has 8 random bp 5′ of the proto-spacer (e.g. total of 65536 different PAM sequences=complexity). The other library has 7 random bp 3′ of the proto-spacer (e.g. total complexity is 16384 different PAMs). Both libraries were cloned to have on average 500 plasmids per possible PAM. Test strain and control strain were transformed with 5′PAM and 3′PAM library in separate transformations and transformed cells were plated separately on ampicillin plates. Recognition and subsequent cutting/interference with the plasmid renders a cell vulnerable to ampicillin and prevents growth. Approximately 12 h after transformation, all colonies formed by the test and control strains were harvested and plasmid RNA was isolated. Plasmid RNA was used as template for PCR amplification and subsequent deep sequencing. Representation of all PAMs in the untransformed libraries showed the expected representation of PAMs in transformed cells. Representation of all PAMs found in control strains showed the actual representation. Representation of all PAMs in test strain showed which PAMs are not recognized by the enzyme and comparison to the control strain allows extracting the sequence of the depleted PAM. In particular embodiments, the cleavage, such as the RNA cleavage, is not PAM dependent. Indeed, for the Bergeyella zoohelcum Cas13b effector protein and its orthologs, RNA target cleavage appears to be PAM independent, and hence the Table 1 Cas13b of the invention may act in a PAM independent fashion.
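The library complexities recited above follow from four possible bases at each random position (4^8 = 65536 and 4^7 = 16384). The Python sketch below reproduces that arithmetic and illustrates one possible way to flag depleted PAMs by comparing normalized read counts between the test and control strains. The depletion threshold, the function names, and the read counts are hypothetical and are supplied for illustration only; they are not data or parameters from the assay described above.

```python
def library_complexity(n_random_positions: int) -> int:
    """Number of distinct sequences for n fully random positions (A, C, G, T)."""
    return 4 ** n_random_positions


def depleted_pams(
    control_counts: dict[str, int],
    test_counts: dict[str, int],
    threshold: float = 0.1,
) -> list[str]:
    """Return PAMs whose normalized frequency in the test strain falls below
    `threshold` times their normalized frequency in the control strain.

    The threshold of 0.1 is an arbitrary illustrative choice.
    """
    control_total = sum(control_counts.values())
    test_total = sum(test_counts.values())
    if control_total == 0 or test_total == 0:
        raise ValueError("both samples need at least one read")
    depleted = []
    for pam, control_reads in control_counts.items():
        if control_reads == 0:
            continue  # no baseline, so no ratio can be computed
        control_freq = control_reads / control_total
        test_freq = test_counts.get(pam, 0) / test_total
        if test_freq / control_freq < threshold:
            depleted.append(pam)
    return sorted(depleted)


print(library_complexity(8), library_complexity(7))

# Hypothetical read counts for three PAMs (toy values, not experimental data).
control = {"AAAA": 100, "AAAC": 100, "AAAG": 100}
test = {"AAAA": 100, "AAAC": 2, "AAAG": 98}
print(depleted_pams(control, test))
```

Normalizing by total reads before taking the ratio keeps the comparison independent of sequencing depth, which will generally differ between the two samples.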

For minimization of toxicity and off-target effect, it will be important to control the concentration of RNA-targeting guide RNA delivered. Optimal concentrations of nucleic acid-targeting guide RNA can be determined by testing different concentrations in a cellular or non-human eukaryote animal model and using deep sequencing to analyze the extent of modification at potential off-target genomic loci. The concentration that gives the highest level of on-target modification while minimizing the level of off-target modification should be chosen for in vivo delivery. The RNA-targeting system is derived advantageously from a CRISPR-Cas system. In some embodiments, one or more elements of an RNA-targeting system is derived from a particular organism comprising an endogenous RNA-targeting system of a Tables 1-4 Cas13 effector protein system as herein-discussed.

Dead Guide Sequence

In one aspect, the invention provides guide sequences which are modified in a manner which allows for formation of the CRISPR Cas complex and successful binding to the target, while at the same time not allowing for successful nuclease activity (i.e., without nuclease activity/without indel activity). For matters of explanation such modified guide sequences are referred to as “dead guides” or “dead guide sequences”. These dead guides or dead guide sequences can be thought of as catalytically inactive or conformationally inactive with regard to nuclease activity. Indeed, dead guide sequences may not sufficiently engage in productive base pairing with respect to the ability to promote catalytic activity or to distinguish on-target and off-target binding activity. Briefly, the assay involves synthesizing a CRISPR target RNA and guide RNAs comprising mismatches with the target RNA, combining these with the RNA targeting enzyme, analyzing cleavage on gels based on the presence of bands generated by cleavage products, and quantifying cleavage based upon relative band intensities.

Hence, in a related aspect, the invention provides a non-naturally occurring or engineered composition RNA targeting CRISPR-Cas system comprising a functional RNA targeting enzyme as described herein, and guide RNA (gRNA) or crRNA wherein the gRNA or crRNA comprises a dead guide sequence whereby the gRNA is capable of hybridizing to a target sequence such that the RNA targeting CRISPR-Cas system is directed to a genomic locus of interest in a cell without detectable RNA cleavage activity of a non-mutant RNA targeting enzyme of the system. It is to be understood that any of the gRNAs or crRNAs according to the invention as described herein elsewhere may be used as dead gRNAs/crRNAs comprising a dead guide sequence.

The ability of a dead guide sequence to direct sequence-specific binding of a CRISPR complex to an RNA target sequence may be assessed by any suitable assay. For example, the components of a CRISPR-Cas system sufficient to form a CRISPR-Cas complex, including the dead guide sequence to be tested, may be provided to a host cell having the corresponding target sequence, such as by transfection with vectors encoding the components of the system, followed by an assessment of preferential cleavage within the target sequence.

As explained further herein, several structural parameters allow for a proper framework to arrive at such dead guides. Dead guide sequences can be typically shorter than respective guide sequences which result in active RNA cleavage. In particular embodiments, dead guides are 5%, 10%, 20%, 30%, 40%, or 50% shorter than respective guides directed to the same target.
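For illustration, the percentage reductions recited above can be converted into nucleotide lengths with a short Python sketch. Rounding down to a whole nucleotide is an assumption of this sketch, as are the function name and the 30 nt example length; the disclosure does not specify a rounding rule.

```python
def shortened_lengths(
    active_length: int,
    percent_reductions: tuple[int, ...] = (5, 10, 20, 30, 40, 50),
) -> dict[int, int]:
    """Map each percentage reduction to a shortened guide length in nt.

    Integer arithmetic is used so that results are exact; fractional
    nucleotides are rounded down (an assumption of this sketch).
    """
    if active_length <= 0:
        raise ValueError("active_length must be positive")
    return {
        pct: active_length * (100 - pct) // 100
        for pct in percent_reductions
    }


# Hypothetical 30-nt active guide.
for pct, length in shortened_lengths(30).items():
    print(f"{pct}% shorter: {length} nt")
```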

As explained below and known in the art, one aspect of gRNA or crRNA-RNA targeting specificity is the direct repeat sequence, which is to be appropriately linked to such guides. In particular, this implies that the direct repeat sequences are designed dependent on the origin of the RNA targeting enzyme. Structural data available for validated dead guide sequences may be used for designing CRISPR-Cas specific equivalents. Structural similarity between, e.g., the orthologous nuclease domains HEPN of two or more CRISPR-Cas effector proteins may be used to transfer design equivalent dead guides. Thus, the dead guide herein may be appropriately modified in length and sequence to reflect such CRISPR-Cas specific equivalents, allowing for formation of the CRISPR-Cas complex and successful binding to the target RNA, while at the same time, not allowing for successful nuclease activity.

Dead guides allow one to use gRNA or crRNA as a means for gene targeting, without the consequence of nuclease activity, while at the same time providing directed means for activation or repression. Guide RNA or crRNA comprising a dead guide may be modified to further include elements in a manner which allow for activation or repression of gene activity, in particular protein adaptors (e.g. aptamers) as described herein elsewhere allowing for functional placement of gene effectors (e.g. activators or repressors of gene activity). One example is the incorporation of aptamers, as explained herein and in the state of the art. By engineering the gRNA or crRNA comprising a dead guide to incorporate protein-interacting aptamers (Konermann et al., “Genome-scale transcription activation by an engineered CRISPR-Cas9 complex,” doi:10.1038/nature14136, incorporated herein by reference), one may assemble multiple distinct effector domains. Such may be modeled after natural processes.

Cas13 in General

The instant invention provides particular Cas13 effectors, nucleic acids, systems, vectors, and methods of use. The features and functions of Cas13 may also be the features and functions of other CRISPR-Cas proteins described herein.

As used herein, the terms Cas13b-s1 accessory protein, Cas13b-s1 protein, Cas13b-s1, Csx27, and Csx27 protein are used interchangeably and the terms Cas13b-s2 accessory protein, Cas13b-s2 protein, Cas13b-s2, Csx28, and Csx28 protein are used interchangeably.

In particular embodiments, the wildtype Cas13 effector protein has RNA binding and cleaving function.

In particular embodiments, the (wild type or mutated) Cas13 effector protein may have RNA and/or DNA cleaving function, preferably RNA cleaving function. In these embodiments, methods may be provided based on the effector proteins provided herein which comprehend inducing one or more mutations in a eukaryotic cell (in vitro, i.e. in an isolated eukaryotic cell) as herein discussed comprising delivering to the cell a vector as herein discussed. The mutation(s) can include the introduction, deletion, or substitution of one or more nucleotides at each target sequence of cell(s) via the guide(s) RNA(s) or sgRNA(s) or crRNA(s). The mutations can include the introduction, deletion, or substitution of 1-75 nucleotides at each target sequence of said cell(s) via the guide(s) RNA(s) or sgRNA(s) or crRNA(s). The mutations can include the introduction, deletion, or substitution of 1, 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence of said cell(s) via the guide(s) RNA(s) or sgRNA(s) or crRNA(s). The mutations can include the introduction, deletion, or substitution of 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence of said cell(s) via the guide(s) RNA(s) or sgRNA(s) or crRNA(s). The mutations can include the introduction, deletion, or substitution of 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence of said cell(s) via the guide(s) RNA(s) or sgRNA(s) or crRNA(s). The mutations can include the introduction, deletion, or substitution of 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, or 75 nucleotides at each target sequence of said cell(s) via the guide(s) RNA(s) or sgRNA(s) or crRNA(s). The mutations can include the introduction, deletion, or substitution of 40, 45, 50, 75, 100, 200, 300, 400 or 500 nucleotides at each target sequence of said cell(s) via the guide(s) RNA(s) or sgRNA(s) or crRNA(s).

For minimization of toxicity and off-target effect, it will be important to control the concentration of Cas13 mRNA and guide RNA delivered. Optimal concentrations of Cas13 mRNA and guide RNA can be determined by testing different concentrations in a cellular or non-human eukaryote animal model and using deep sequencing to analyze the extent of modification at potential off-target genomic loci. Guide sequences and strategies to minimize toxicity and off-target effects can be as in WO 2014/093622 (PCT/US2013/074667); or, via mutation as herein.

The nucleic acid molecule encoding a Cas13 is advantageously codon optimized. An example of a codon optimized sequence is, in this instance, a sequence optimized for expression in a eukaryote, e.g., humans (i.e. being optimized for expression in humans), or for another eukaryote, animal or mammal as herein discussed; see, e.g., SaCas9 human codon optimized sequence in WO 2014/093622 (PCT/US2013/074667). Whilst this is preferred, it will be appreciated that other examples are possible and codon optimization for a host species other than human, or for codon optimization for specific organs is known. In some embodiments, an enzyme coding sequence encoding a Cas is codon optimized for expression in particular cells, such as eukaryotic cells. The eukaryotic cells may be those of or derived from a particular organism, such as a mammal, including but not limited to human, or non-human eukaryote or animal or mammal as herein discussed, e.g., mouse, rat, rabbit, dog, livestock, or non-human mammal or primate. In some embodiments, processes for modifying the germ line genetic identity of human beings and/or processes for modifying the genetic identity of animals which are likely to cause them suffering without any substantial medical benefit to man or animal, and also animals resulting from such processes, may be excluded. In general, codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon (e.g. about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence. Various species exhibit particular bias for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, at the “Codon Usage Database” available at www.kazusa.or.jp/codon/ and these tables can be adapted in a number of ways. See Nakamura, Y., et al. “Codon usage tabulated from the international DNA sequence databases: status for the year 2000” Nucl. Acids Res. 28:292 (2000). Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell, such as Gene Forge (Aptagen; Jacobus, Pa.), are also available. In some embodiments, one or more codons (e.g. 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more, or all codons) in a sequence encoding a Cas correspond to the most frequently used codon for a particular amino acid.
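By way of non-limiting illustration only, the replacement of native codons with codons more frequently used in a host cell, while maintaining the native amino acid sequence, may be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the frequencies in TOY_USAGE are placeholder values chosen for illustration, not values taken from the Codon Usage Database. In practice, a full usage table for the host of interest would be supplied.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# ASSUMPTION: placeholder host codon frequencies, for illustration only.
TOY_USAGE = {
    "CTG": 0.40, "TTA": 0.08,   # leucine
    "AAG": 0.57, "AAA": 0.43,   # lysine
    "GCC": 0.40, "GCG": 0.11,   # alanine
}


def translate(dna):
    """Translate an in-frame DNA coding sequence to one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna, usage):
    """Replace each codon with the most frequent synonymous codon in `usage`.

    Codons whose amino acid has no entry in `usage` are left unchanged,
    so the encoded amino acid sequence is always preserved.
    """
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    preferred = {}
    for codon, freq in usage.items():
        aa = CODON_TABLE[codon]
        if aa not in preferred or freq > usage[preferred[aa]]:
            preferred[aa] = codon
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join(preferred.get(CODON_TABLE[c], c) for c in codons)


native = "ATGTTAAAAGCGTAA"          # hypothetical toy sequence: M L K A stop
optimized = codon_optimize(native, TOY_USAGE)
```

In this sketch the encoded protein is unchanged, i.e. `translate(native)` equals `translate(optimized)`; only the choice among synonymous codons differs.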

In some embodiments, the unmodified RNA-targeting effector protein (Cas13) may have cleavage activity. In some embodiments, Cas13 may direct cleavage of one or two nucleic acid strands at the location of or near a target sequence, such as within the target sequence and/or within the complement of the target sequence or at sequences associated with the target sequence, e.g., within about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, 500, or more base pairs from the first or last nucleotide of a target sequence. In some embodiments, the Cas13 protein may direct more than one cleavage (such as one, two, three, four, five, or more cleavages) of one or two strands within the target sequence and/or within the complement of the target sequence or at sequences associated with the target sequence and/or within about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 100, 200, 500, or more base pairs from the first or last nucleotide of a target sequence. In some embodiments, the cleavage may be blunt, i.e., generating blunt ends. In some embodiments, the cleavage may be staggered, i.e., generating sticky ends. In some embodiments, a vector encodes a nucleic acid-targeting Cas13 protein that may be mutated with respect to a corresponding wild-type enzyme such that the mutated nucleic acid-targeting Cas13 protein lacks the ability to cleave one or two strands of a target polynucleotide containing a target sequence, e.g., alteration or mutation in a HEPN domain to produce a mutated Cas13 substantially lacking all RNA cleavage activity, e.g., the RNA cleavage activity of the mutated enzyme is no more than about 25%, 10%, 5%, 1%, 0.1%, 0.01%, or less of the nucleic acid cleavage activity of the non-mutated form of the enzyme; an example can be when the nucleic acid cleavage activity of the mutated form is nil or negligible as compared with the non-mutated form. By derived, Applicants mean that the derived enzyme is largely based, in the sense of having a high degree of sequence homology with, a wildtype enzyme, but that it has been mutated (modified) in some way as known in the art or as described herein.

Typically, in the context of an endogenous RNA-targeting system, formation of a RNA-targeting complex (comprising a guide RNA or crRNA hybridized to a target sequence and complexed with one or more RNA-targeting effector proteins) results in cleavage of RNA strand(s) in or near (e.g., within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, or more base pairs from) the target sequence. As used herein the term “sequence(s) associated with a target locus of interest” refers to sequences in the vicinity of the target sequence (e.g. within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, or more base pairs from the target sequence, wherein the target sequence is comprised within a target locus of interest).

An example of a codon optimized sequence is, in this instance, a sequence optimized for expression in a eukaryote, e.g., humans (i.e. being optimized for expression in humans), or for another eukaryote, animal or mammal as herein discussed; see, e.g., SaCas9 human codon optimized sequence in WO 2014/093622 (PCT/US2013/074667) as an example of a codon optimized sequence (from knowledge in the art and this disclosure, codon optimizing coding nucleic acid molecule(s), especially as to effector protein (e.g., Cas13), is within the ambit of the skilled artisan). Whilst this is preferred, it will be appreciated that other examples are possible and codon optimization for a host species other than human, or for codon optimization for specific organs is known. In some embodiments, an enzyme coding sequence encoding a RNA-targeting Cas13 protein is codon optimized for expression in particular cells, such as eukaryotic cells. The eukaryotic cells may be those of or derived from a particular organism, such as a mammal, including but not limited to human, or non-human eukaryote or animal or mammal as herein discussed, e.g., mouse, rat, rabbit, dog, livestock, or non-human mammal or primate. In some embodiments, processes for modifying the germ line genetic identity of human beings and/or processes for modifying the genetic identity of animals which are likely to cause them suffering without any substantial medical benefit to man or animal, and also animals resulting from such processes, may be excluded. In general, codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence. Various species exhibit particular bias for certain codons of a particular amino acid. Codon bias (differences in codon usage between organisms) often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, among other things, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example, at the “Codon Usage Database” available at www.kazusa.or.jp/codon/ and these tables can be adapted in a number of ways. See Nakamura, Y., et al. “Codon usage tabulated from the international DNA sequence databases: status for the year 2000” Nucl. Acids Res. 28:292 (2000). Computer algorithms for codon optimizing a particular sequence for expression in a particular host cell, such as Gene Forge (Aptagen; Jacobus, Pa.), are also available. In some embodiments, one or more codons (e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more, or all codons) in a sequence encoding a DNA/RNA-targeting Cas protein correspond to the most frequently used codon for a particular amino acid.

The (i) Cas13 or nucleic acid molecule(s) encoding it or (ii) crRNA can be delivered separately; and advantageously at least one or both of (i) and (ii), e.g., an assembled complex, is delivered via a particle or nanoparticle complex. RNA-targeting effector protein mRNA can be delivered prior to the RNA-targeting guide RNA or crRNA to give time for nucleic acid-targeting effector protein to be expressed. RNA-targeting effector protein (Cas13) mRNA might be administered 1-12 hours (preferably around 2-6 hours) prior to the administration of RNA-targeting guide RNA or crRNA. Alternatively, RNA-targeting effector protein mRNA and RNA-targeting guide RNA or crRNA can be administered together. Advantageously, a second booster dose of guide RNA or crRNA can be administered 1-12 hours (preferably around 2-6 hours) after the initial administration of RNA-targeting effector (Cas13) protein mRNA + guide RNA. Additional administrations of RNA-targeting effector protein mRNA and/or guide RNA or crRNA might be useful to achieve the most efficient levels of genome modification.

In one aspect, the invention provides methods for using one or more elements of a RNA-targeting system. The RNA-targeting complex of the invention provides an effective means for modifying a target RNA, whether single or double stranded, linear or super-coiled. The RNA-targeting complex of the invention has a wide variety of utility including modifying (e.g., deleting, inserting, translocating, inactivating, activating) a target RNA in a multiplicity of cell types. As such the RNA-targeting complex of the invention has a broad spectrum of applications in, e.g., gene therapy, drug screening, disease diagnosis, and prognosis. An exemplary RNA-targeting complex comprises a RNA-targeting effector protein complexed with a guide RNA or crRNA hybridized to a target sequence within the target locus of interest.

In one embodiment, this invention provides a method of cleaving a target RNA. The method may comprise modifying a target RNA using a RNA-targeting complex that binds to the target RNA and effects cleavage of said target RNA. In an embodiment, the RNA-targeting complex of the invention, when introduced into a cell, may create a break (e.g., a single or a double strand break) in the RNA sequence. For example, the method can be used to cleave a disease RNA in a cell. For example, an exogenous RNA template comprising a sequence to be integrated flanked by an upstream sequence and a downstream sequence may be introduced into a cell. The upstream and downstream sequences share sequence similarity with either side of the site of integration in the RNA. Where desired, a donor RNA can be mRNA. The exogenous RNA template comprises a sequence to be integrated (e.g., a mutated RNA). The sequence for integration may be a sequence endogenous or exogenous to the cell. Examples of a sequence to be integrated include RNA encoding a protein or a non-coding RNA (e.g., a microRNA). Thus, the sequence for integration may be operably linked to an appropriate control sequence or sequences. Alternatively, the sequence to be integrated may provide a regulatory function. The upstream and downstream sequences in the exogenous RNA template are selected to promote recombination between the RNA sequence of interest and the donor RNA. The upstream sequence is a RNA sequence that shares sequence similarity with the RNA sequence upstream of the targeted site for integration. Similarly, the downstream sequence is a RNA sequence that shares sequence similarity with the RNA sequence downstream of the targeted site of integration. The upstream and downstream sequences in the exogenous RNA template can have 75%, 80%, 85%, 90%, 95%, or 100% sequence identity with the targeted RNA sequence. Preferably, the upstream and downstream sequences in the exogenous RNA template have about 95%, 96%, 97%, 98%, 99%, or 100% sequence identity with the targeted RNA sequence. In some methods, the upstream and downstream sequences in the exogenous RNA template have about 99% or 100% sequence identity with the targeted RNA sequence. An upstream or downstream sequence may comprise from about 20 bp to about 2500 bp, for example, about 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, or 2500 bp. In some methods, the exemplary upstream or downstream sequence has about 200 bp to about 2000 bp, about 600 bp to about 1000 bp, or more particularly about 700 bp to about 1000 bp. In some methods, the exogenous RNA template may further comprise a marker. Such a marker may make it easy to screen for targeted integrations. Examples of suitable markers include restriction sites, fluorescent proteins, or selectable markers. The exogenous RNA template of the invention can be constructed using recombinant techniques (see, for example, Sambrook et al., 2001 and Ausubel et al., 1996). In a method for modifying a target RNA by integrating an exogenous RNA template, a break (e.g., double or single stranded break in double or single stranded RNA) is introduced into the RNA sequence by the nucleic acid-targeting complex, and the break is repaired via homologous recombination with an exogenous RNA template such that the template is integrated into the RNA target. The presence of a double-stranded break facilitates integration of the template. In other embodiments, this invention provides a method of modifying expression of a RNA in a eukaryotic cell. The method comprises increasing or decreasing expression of a target polynucleotide by using a nucleic acid-targeting complex that binds to the DNA or RNA (e.g., mRNA or pre-mRNA). In some methods, a target RNA can be inactivated to effect the modification of the expression in a cell. For example, upon the binding of a RNA-targeting complex to a target sequence in a cell, the target RNA is inactivated such that the sequence is not translated, the coded protein is not produced, or the sequence does not function as the wild-type sequence does. For example, a protein or microRNA coding sequence may be inactivated such that the protein or microRNA or pre-microRNA transcript is not produced. The target RNA of a RNA-targeting complex can be any RNA endogenous or exogenous to the eukaryotic cell. For example, the target RNA can be a RNA residing in the nucleus of the eukaryotic cell. The target RNA can be a sequence (e.g., mRNA or pre-mRNA) coding a gene product (e.g., a protein) or a non-coding sequence (e.g., ncRNA, lncRNA, tRNA, or rRNA). Examples of target RNA include a sequence associated with a signaling biochemical pathway, e.g., a signaling biochemical pathway-associated RNA. Examples of target RNA include a disease associated RNA. A “disease-associated” RNA refers to any RNA which is yielding translation products at an abnormal level or in an abnormal form in cells derived from a disease-affected tissue compared with tissues or cells of a non-disease control. It may be a RNA transcribed from a gene that becomes expressed at an abnormally high level; it may be a RNA transcribed from a gene that becomes expressed at an abnormally low level, where the altered expression correlates with the occurrence and/or progression of the disease. A disease-associated RNA also refers to a RNA transcribed from a gene possessing mutation(s) or genetic variation that is directly responsible or is in linkage disequilibrium with a gene(s) that is responsible for the etiology of a disease.
The translated products may be known or unknown, and may be at a normal or abnormal level.
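By way of non-limiting illustration only, the percent sequence identity between an upstream or downstream sequence of an exogenous RNA template and the corresponding targeted RNA sequence may be computed as sketched below in Python. This is a minimal sketch assuming that the two sequences have already been aligned to equal length; the function names and the toy sequences are hypothetical, and the default threshold of 95% is merely one of the values recited above.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned sequences.

    ASSUMPTION: both sequences are already aligned and of equal length;
    no gap handling or alignment is performed here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def flank_is_acceptable(flank, target, min_identity=95.0):
    """True when the flanking sequence meets the chosen identity threshold."""
    return percent_identity(flank, target) >= min_identity


# Hypothetical toy sequences differing at the final position only.
upstream_flank = "AUGGCUAGCU"
target_region = "AUGGCUAGCA"
identity = percent_identity(upstream_flank, target_region)
```

Here nine of ten positions match, so the flank satisfies a 75% threshold but not a 95% threshold.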

In some embodiments, the method may comprise allowing a RNA-targeting complex to bind to the target RNA to effect cleavage of said target RNA thereby modifying the target RNA, wherein the RNA-targeting complex comprises a nucleic acid-targeting effector (Cas13) protein complexed with a guide RNA or crRNA hybridized to a target sequence within said target RNA. In one aspect, the invention provides a method of modifying expression of RNA in a eukaryotic cell. In some embodiments, the method comprises allowing a RNA-targeting complex to bind to the RNA such that said binding results in increased or decreased expression of said RNA; wherein the RNA-targeting complex comprises a nucleic acid-targeting effector (Cas13) protein complexed with a guide RNA. Methods of modifying a target RNA can be in a eukaryotic cell, which may be in vivo, ex vivo or in vitro. In some embodiments, the method comprises sampling a cell or population of cells from a human or non-human animal, and modifying the cell or cells. Culturing may occur at any stage ex vivo. The cell or cells may even be re-introduced into the non-human animal or plant. For re-introduced cells it is particularly preferred that the cells are stem cells.

The use of two different aptamers (each associated with a distinct RNA-targeting guide RNA) allows an activator-adaptor protein fusion and a repressor-adaptor protein fusion to be used, with different RNA-targeting guide RNAs or crRNAs, to activate expression of one RNA, whilst repressing another. They, along with their different guide RNAs or crRNAs can be administered together, or substantially together, in a multiplexed approach. A large number of such modified RNA-targeting guide RNAs or crRNAs can be used all at the same time, for example 10 or 20 or 30 and so forth, whilst only one (or at least a minimal number) of effector protein (Cas13) molecules need to be delivered, as a comparatively small number of effector protein molecules can be used with a large number of modified guides. The adaptor protein may be associated with (preferably linked or fused to) one or more activators or one or more repressors. For example, the adaptor protein may be associated with a first activator and a second activator. The first and second activators may be the same, but they are preferably different activators. Three or more or even four or more activators (or repressors) may be used, but package size may prevent the number from exceeding 5 different functional domains. Linkers are preferably used, over a direct fusion to the adaptor protein, where two or more functional domains are associated with the adaptor protein. Suitable linkers might include the GlySer linker.

It is also envisaged that the RNA-targeting effector protein-guide RNA complex as a whole may be associated with two or more functional domains. For example, there may be two or more functional domains associated with the RNA-targeting effector protein, or there may be two or more functional domains associated with the guide RNA or crRNA (via one or more adaptor proteins), or there may be one or more functional domains associated with the RNA-targeting effector protein and one or more functional domains associated with the guide RNA or crRNA (via one or more adaptor proteins).

The fusion between the adaptor protein and the activator or repressor may include a linker. For example, GlySer linkers GGGS can be used. They can be used in repeats of 3 ((GGGGS)₃ (SEQ ID NO:79)) or 6, 9 or even 12 or more, to provide suitable lengths, as required. Linkers can be used between the guide RNAs and the functional domain (activator or repressor), or between the nucleic acid-targeting effector protein and the functional domain (activator or repressor). The linkers allow the user to engineer appropriate amounts of “mechanical flexibility”.
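By way of non-limiting illustration only, the assembly of a repeated GlySer linker between an adaptor protein and a functional domain may be sketched in Python as follows. This is a minimal sketch: the function names are hypothetical, the repeat unit GGGGS is taken from the (GGGGS)₃ example above, and the strings "ADAPTOR" and "DOMAIN" are placeholder labels rather than real amino acid sequences.

```python
def glyser_linker(repeats, unit="GGGGS"):
    """Return a GlySer linker made of `repeats` copies of `unit`."""
    if repeats < 1:
        raise ValueError("at least one repeat is required")
    return unit * repeats


def fuse(adaptor, functional_domain, repeats=3):
    """Join an adaptor and a functional domain through a GlySer linker.

    ASSUMPTION: all sequences are plain one-letter amino acid strings
    written N-terminus to C-terminus.
    """
    return adaptor + glyser_linker(repeats) + functional_domain


# Placeholder labels standing in for real adaptor and activator sequences.
fusion = fuse("ADAPTOR", "DOMAIN", repeats=3)
```

Varying `repeats` (e.g. 3, 6, 9 or 12) changes the linker length, which is the parameter the passage above associates with mechanical flexibility.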

CRISPR effector (Cas13) protein or mRNA therefor (or more generally a nucleic acid molecule therefor) and guide RNA or crRNA might also be delivered separately e.g., the former 1-12 hours (preferably around 2-6 hours) prior to the administration of guide RNA or crRNA, or together. A second booster dose of guide RNA or crRNA can be administered 1-12 hours (preferably around 2-6 hours) after the initial administration.

The Cas13 effector protein is sometimes referred to herein as a CRISPR Enzyme. It will be appreciated that the effector protein is based on or derived from an enzyme, so the term ‘effector protein’ certainly includes ‘enzyme’ in some embodiments. However, it will also be appreciated that the effector protein may, as required in some embodiments, have DNA or RNA binding, but not necessarily cutting or nicking, activity, including a dead-Cas effector protein function.

Cellular targets include Hemopoietic Stem/Progenitor Cells (CD34+); Human T cells; and Eye (retinal cells), for example photoreceptor precursor cells.

Inventive methods can further comprise delivery of templates. Delivery of templates may be contemporaneous with or separate from delivery of any or all of the CRISPR effector protein (Cas13) or guide or crRNA, and via the same delivery mechanism or a different one.

In certain embodiments, the methods as described herein may comprise providing a Cas13 transgenic cell in which one or more nucleic acids encoding one or more guide RNAs are provided or introduced operably connected in the cell with a regulatory element comprising a promoter of one or more gene of interest. As used herein, the term “Cas13 transgenic cell” refers to a cell, such as a eukaryotic cell, in which a Cas13 gene has been genomically integrated. The nature, type, or origin of the cell are not particularly limiting according to the present invention. The way in which the Cas13 transgene is introduced into the cell may also vary and can be any method as is known in the art. In certain embodiments, the Cas13 transgenic cell is obtained by introducing the Cas13 transgene in an isolated cell. In certain other embodiments, the Cas13 transgenic cell is obtained by isolating cells from a Cas13 transgenic organism. By means of example, and without limitation, the Cas13 transgenic cell as referred to herein may be derived from a Cas13 transgenic eukaryote, such as a Cas13 knock-in eukaryote. Reference is made to WO 2014/093622 (PCT/US13/74667), incorporated herein by reference. Methods of US Patent Publication Nos. 20120017290 and 20110265198 assigned to Sangamo BioSciences, Inc. directed to targeting the Rosa locus may be modified to utilize the CRISPR Cas system of the present invention. Methods of US Patent Publication No. 20130236946 assigned to Cellectis directed to targeting the Rosa locus may also be modified to utilize the CRISPR Cas system of the present invention. By means of further example reference is made to Platt et al. (Cell; 159(2):440-455 (2014)), describing a Cas9 knock-in mouse, which is incorporated herein by reference. The Cas13 transgene can further comprise a Lox-Stop-polyA-Lox (LSL) cassette thereby rendering Cas13 expression inducible by Cre recombinase. Alternatively, the Cas13 transgenic cell may be obtained by introducing the Cas13 transgene in an isolated cell. Delivery systems for transgenes are well known in the art. By means of example, the Cas13 transgene may be delivered in for instance eukaryotic cell by means of vector (e.g., AAV, adenovirus, lentivirus) and/or particle and/or nanoparticle delivery, as also described herein elsewhere.

It will be understood by the skilled person that the cell, such as the Cas13 transgenic cell, as referred to herein may comprise further genomic alterations besides having an integrated Cas13 gene or the mutations arising from the sequence specific action of Cas13 when complexed with RNA capable of guiding Cas13 to a target locus, such as for instance one or more oncogenic mutations, as for instance and without limitation described in Platt et al. (2014), Chen et al., (2014) or Kumar et al. (2009).

In some embodiments, the Cas13 sequence is fused to one or more nuclear localization sequences (NLSs), such as about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs. In some embodiments, the Cas13 comprises about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the amino-terminus, about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the carboxy-terminus, or a combination of these (e.g. zero or at least one or more NLS at the amino-terminus and zero or at least one or more NLS at the carboxy-terminus). When more than one NLS is present, each may be selected independently of the others, such that a single NLS may be present in more than one copy and/or in combination with one or more other NLSs present in one or more copies. In a preferred embodiment of the invention, the Cas13 comprises at most 6 NLSs. In some embodiments, an NLS is considered near the N- or C-terminus when the nearest amino acid of the NLS is within about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, or more amino acids along the polypeptide chain from the N- or C-terminus. Non-limiting examples of NLSs include an NLS sequence derived from: the NLS of the SV40 virus large T-antigen, having the amino acid sequence PKKKRKV (SEQ ID NO: 80); the NLS from nucleoplasmin (e.g. the nucleoplasmin bipartite NLS with the sequence KRPAATKKAGQAKKKK) (SEQ ID NO: 81); the c-myc NLS having the amino acid sequence PAAKRVKLD (SEQ ID NO: 82) or RQRRNELKRSP (SEQ ID NO: 83); the hRNPA1 M9 NLS having the sequence NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY (SEQ ID NO: 84); the sequence RMRIZFKNKGKDTAELRRRRVEVSVELRKAKKDEQILKRRNV (SEQ ID NO: 85) of the IBB domain from importin-alpha; the sequences VSRKRPRP (SEQ ID NO: 86) and PPKKARED (SEQ ID NO: 87) of the myoma T protein; the sequence POPKKKPL (SEQ ID NO: 88) of human p53; the sequence SALIKKKKKMAP (SEQ ID NO: 89) of mouse c-abl IV; the sequences DRLRR (SEQ ID NO: 90) and PKQKKRK (SEQ ID NO: 91) of the influenza virus NS1; the sequence RKLKKKIKKL (SEQ ID NO: 92) of the Hepatitis virus delta antigen; the sequence REKKKFLKRR (SEQ ID NO: 93) of the mouse Mx1 protein; the sequence KRKGDEVDGVDEVAKKKSKK (SEQ ID NO: 94) of the human poly(ADP-ribose) polymerase; and the sequence RKCLQAGMNLEARKTKK (SEQ ID NO: 95) of the steroid hormone receptors (human) glucocorticoid. In general, the one or more NLSs are of sufficient strength to drive accumulation of the Cas in a detectable amount in the nucleus of a eukaryotic cell. In general, strength of nuclear localization activity may derive from the number of NLSs in the Cas, the particular NLS(s) used, or a combination of these factors. Detection of accumulation in the nucleus may be performed by any suitable technique. For example, a detectable marker may be fused to the Cas, such that location within a cell may be visualized, such as in combination with a means for detecting the location of the nucleus (e.g. a stain specific for the nucleus such as DAPI). Cell nuclei may also be isolated from cells, the contents of which may then be analyzed by any suitable process for detecting protein, such as immunohistochemistry, Western blot, or enzyme activity assay. Accumulation in the nucleus may also be determined indirectly, such as by an assay for the effect of CRISPR complex formation (e.g. assay for DNA cleavage or mutation at the target sequence, or assay for altered gene expression activity affected by CRISPR complex formation and/or Cas enzyme activity), as compared to a control not exposed to the Cas or complex, or exposed to a Cas lacking the one or more NLSs.
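By way of non-limiting illustration only, the criterion that an NLS is considered near the N- or C-terminus when its nearest amino acid lies within a given number of amino acids of that terminus may be sketched in Python as follows. This is a minimal sketch: the function names are hypothetical, the toy protein is a placeholder sequence, and the distance is assumed to be counted as the number of residues lying between the terminus and the nearest residue of the NLS.

```python
def find_all(protein, motif):
    """Return every 0-based start index at which `motif` occurs."""
    positions = []
    start = protein.find(motif)
    while start != -1:
        positions.append(start)
        start = protein.find(motif, start + 1)
    return positions


def nls_near_termini(protein, nls, window):
    """Report whether any copy of `nls` lies within `window` residues of
    the N-terminus and/or the C-terminus.

    ASSUMPTION: distance is the count of residues between the terminus
    and the nearest residue of the NLS.
    """
    near_n = False
    near_c = False
    for start in find_all(protein, nls):
        residues_before = start
        residues_after = len(protein) - (start + len(nls))
        near_n = near_n or residues_before <= window
        near_c = near_c or residues_after <= window
    return near_n, near_c


SV40_NLS = "PKKKRKV"                          # SV40 large T-antigen NLS
toy_protein = "M" + SV40_NLS + "A" * 50       # hypothetical placeholder protein
result = nls_near_termini(toy_protein, SV40_NLS, window=5)
```

In this toy case the NLS sits one residue from the N-terminus and fifty residues from the C-terminus, so with a window of 5 it counts as near the N-terminus only.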

The guide RNA(s), e.g., sgRNA(s) or crRNA(s) encoding sequences and/or Cas13 encoding sequences, can be functionally or operatively linked to regulatory element(s) and hence the regulatory element(s) drive expression. The promoter(s) can be constitutive promoter(s) and/or conditional promoter(s) and/or inducible promoter(s) and/or tissue specific promoter(s). The promoter can be selected from the group consisting of RNA polymerases, pol I, pol II, pol III, T7, U6, H1, retroviral Rous sarcoma virus (RSV) LTR promoter, the cytomegalovirus (CMV) promoter, the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerol kinase (PGK) promoter, and the EF1a promoter. An advantageous promoter is the U6 promoter.

In some embodiments, a CRISPR effector (Cas13) protein may form a component of an inducible system. The inducible nature of the system would allow for spatiotemporal control of gene editing or gene expression using a form of energy. The form of energy may include but is not limited to electromagnetic radiation, sound energy, chemical energy and thermal energy. Examples of inducible systems include tetracycline inducible promoters (Tet-On or Tet-Off), small molecule two-hybrid transcription activation systems (FKBP, ABA, etc.), or light inducible systems (Phytochrome, LOV domains, or cryptochrome). In one embodiment, the CRISPR effector protein may be a part of a Light Inducible Transcriptional Effector (LITE) to direct changes in transcriptional activity in a sequence-specific manner. The components of a LITE may include a CRISPR effector protein, a light-responsive cryptochrome heterodimer (e.g. from Arabidopsis thaliana), and a transcriptional activation/repression domain. Further examples of inducible DNA binding proteins and methods for their use are provided in U.S. 61/736,465 and U.S. 61/721,283, and WO 2014018423 A2 which is hereby incorporated by reference in its entirety.

Whenever reference is made herein to Cas13, it will be understood that a mutated Cas13 according to the invention as described herein is meant, unless explicitly indicated otherwise. Whenever reference is made herein to Cas13, preferably a mutated Cas13a, Cas13b, Cas13c, or Cas13d according to the invention as described herein is meant, unless explicitly indicated otherwise. Whenever reference is made herein to Cas13, preferably a mutated Cas13b according to the invention as described herein is meant, unless explicitly indicated otherwise.

In one aspect, the invention provides a mutated Cas13 as described herein, such as preferably, but without limitation Cas13b as described herein elsewhere, having one or more mutations resulting in reduced off-target effects, i.e. improved CRISPR enzymes for use in effecting modifications to target loci but which reduce or eliminate activity towards off-targets, such as when complexed to guide RNAs, as well as improved CRISPR enzymes for increasing the activity of CRISPR enzymes, such as when complexed with guide RNAs. It is to be understood that mutated enzymes as described herein below may be used in any of the methods according to the invention as described herein elsewhere. Any of the methods, products, compositions and uses as described herein elsewhere are equally applicable with the mutated CRISPR enzymes as further detailed below.

Slaymaker et al. recently described a method for the generation of Cas9 orthologues with enhanced specificity (Slaymaker et al. 2015 “Rationally engineered Cas9 nucleases with improved specificity”). This strategy can be used to enhance the specificity of the Cas13 protein. Primary residues for mutagenesis are preferably all positively charged residues within the HEPN domain. Additional residues are positively charged residues that are conserved between different orthologues.
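By way of a non-limiting illustration, candidate residues for such mutagenesis might be enumerated as in the following Python sketch, which lists the positively charged residues (lysine and arginine, optionally histidine) within a chosen region of a protein sequence and prints them as alanine substitutions. The toy sequence, the region boundaries, the function name, and the choice of alanine are assumptions made for illustration only and do not correspond to any Cas13 sequence or HEPN domain boundary of the invention.

```python
from typing import List, Tuple


def positive_residues(seq: str, start: int, end: int,
                      include_his: bool = False) -> List[Tuple[int, str]]:
    """Return (1-based position, residue) for each positively charged
    residue in the region start..end (1-based, inclusive) of seq."""
    charged = {"K", "R"}
    if include_his:
        charged.add("H")
    return [(pos, aa) for pos, aa in enumerate(seq, start=1)
            if start <= pos <= end and aa in charged]


# Hypothetical toy sequence and region; not a real Cas13 sequence.
toy_seq = "MKTLRAHEGKNDR"
for pos, aa in positive_residues(toy_seq, 2, 10):
    # Print each candidate as an alanine substitution, e.g. K2A.
    print(f"{aa}{pos}A")
```

Restricting the same scan to positions conserved across an alignment of orthologues would give the additional residues mentioned above; that alignment step is not shown here.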

In an aspect, the invention also provides methods and mutations for modulating Cas13 binding activity and/or binding specificity. In certain embodiments Cas13 proteins lacking nuclease activity are used. In certain embodiments, modified guide RNAs are employed that promote binding but not nuclease activity of a Cas13 nuclease. In such embodiments, on-target binding can be increased or decreased. Also, in such embodiments off-target binding can be increased or decreased. Moreover, there can be increased or decreased specificity as to on-target binding vs. off-target binding.

The methods and mutations which can be employed in various combinations to increase or decrease activity and/or specificity of on-target vs. off-target activity, or increase or decrease binding and/or specificity of on-target vs. off-target binding, can be used to compensate or enhance mutations or modifications made to promote other effects. Such mutations or modifications made to promote other effects include mutations or modifications to the Cas13 and/or mutations or modifications made to a guide RNA. The methods and mutations of the invention are used to modulate Cas13 nuclease activity and/or binding with chemically modified guide RNAs.

In an aspect, the invention provides methods and mutations for modulating binding and/or binding specificity of Cas13 proteins according to the invention as defined herein comprising functional domains such as nucleases, transcriptional activators, transcriptional repressors, and the like. For example, a Cas13 protein can be made nuclease-null, or having altered or reduced nuclease activity by introducing mutations such as for instance Cas13 mutations described herein elsewhere. Nuclease deficient Cas13 proteins are useful for RNA-guided target sequence dependent delivery of functional domains. The invention provides methods and mutations for modulating binding of Cas13 proteins. In one embodiment, the functional domain comprises VP64, providing an RNA-guided transcription factor. In another embodiment, the functional domain comprises Fok I, providing an RNA-guided nuclease activity. Mention is made of U.S. Pat. Pub. 2014/0356959, U.S. Pat. Pub. 2014/0342456, U.S. Pat. Pub. 2015/0031132, and Mali, P. et al., 2013, Science 339(6121):823-6, doi: 10.1126/science.1232033, published online 3 Jan. 2013 and through the teachings herein the invention comprehends methods and materials of these documents applied in conjunction with the teachings herein. In certain embodiments, on-target binding is increased. In certain embodiments, off-target binding is decreased. In certain embodiments, on-target binding is decreased. In certain embodiments, off-target binding is increased. Accordingly, the invention also provides for increasing or decreasing specificity of on-target binding vs. off-target binding of functionalized Cas13 binding proteins.

The use of Cas13 as an RNA-guided binding protein is not limited to nuclease-null Cas13. Cas13 enzymes comprising nuclease activity can also function as RNA-guided binding proteins when used with certain guide RNAs. For example, short guide RNAs and guide RNAs comprising nucleotides mismatched to the target can promote RNA directed Cas13 binding to a target sequence with little or no target cleavage. (See, e.g., Dahlman, 2015, Nat Biotechnol. 33(11):1159-1161, doi: 10.1038/nbt.3390, published online 5 Oct. 2015). In an aspect, the invention provides methods and mutations for modulating binding of Cas13 proteins that comprise nuclease activity. In certain embodiments, on-target binding is increased. In certain embodiments, off-target binding is decreased. In certain embodiments, on-target binding is decreased. In certain embodiments, off-target binding is increased. In certain embodiments, there is increased or decreased specificity of on-target binding vs. off-target binding. In certain embodiments, nuclease activity of guide RNA-Cas13 enzyme is also modulated.

RNA-RNA duplex formation is important for cleavage activity and specificity throughout the target region, not only the seed region sequence closest to the PAM. Thus, truncated guide RNAs show reduced cleavage activity and specificity. In an aspect, the invention provides methods and mutations for increasing activity and specificity of cleavage using altered guide RNAs.

In certain embodiments, the catalytic activity of the CRISPR-Cas protein (e.g., Cas13) of the invention is altered or modified. It is to be understood that mutated Cas13 has an altered or modified catalytic activity if the catalytic activity is different than the catalytic activity of the corresponding wild type CRISPR-Cas protein (e.g., unmutated CRISPR-Cas protein). Catalytic activity can be determined by means known in the art. By means of example, and without limitation, catalytic activity can be determined in vitro or in vivo by determination of indel percentage (for instance after a given time, or at a given dose). In certain embodiments, catalytic activity is increased. In certain embodiments, catalytic activity is increased by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In certain embodiments, catalytic activity is decreased. In certain embodiments, catalytic activity is decreased by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or (substantially) 100%. The one or more mutations herein may inactivate the catalytic activity, which may mean that substantially all catalytic activity is removed, that catalytic activity is below detectable levels, or that there is no measurable catalytic activity.
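The percentage thresholds recited above can be illustrated with the following Python sketch, which computes the signed percent change of a mutant measurement relative to the wild type measurement and checks it against a chosen threshold. The function names and the numerical values are hypothetical, and the sketch assumes that both measurements were obtained with the same assay under the same conditions.

```python
def percent_change(mutant: float, wild_type: float) -> float:
    """Signed percent change of a mutant measurement relative to wild type."""
    if wild_type <= 0:
        raise ValueError("wild type measurement must be positive")
    return (mutant - wild_type) * 100.0 / wild_type


def meets_threshold(mutant: float, wild_type: float,
                    threshold_pct: float, direction: str) -> bool:
    """True if the mutant is increased (or decreased) by at least threshold_pct."""
    change = percent_change(mutant, wild_type)
    if direction == "increase":
        return change >= threshold_pct
    if direction == "decrease":
        return -change >= threshold_pct
    raise ValueError("direction must be 'increase' or 'decrease'")


# Hypothetical activity readouts in arbitrary units.
wt_activity, mutant_activity = 40.0, 52.0
print(percent_change(mutant_activity, wt_activity))
print(meets_threshold(mutant_activity, wt_activity, 20.0, "increase"))
```

A decrease of (substantially) 100% in this sketch corresponds to a mutant measurement at or near zero.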

One or more characteristics of the engineered CRISPR-Cas protein may be different from a corresponding wild type CRISPR-Cas protein. Examples of such characteristics include catalytic activity, gRNA binding, specificity of the CRISPR-Cas protein (e.g., specificity of editing a defined target), stability of the CRISPR-Cas protein, off-target binding, target binding, protease activity, nickase activity, and PFS recognition. In some examples, an engineered CRISPR-Cas protein may comprise one or more mutations of the corresponding wild type CRISPR-Cas protein. In some embodiments, the catalytic activity of the engineered CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the catalytic activity of the engineered CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the gRNA binding of the engineered CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the gRNA binding of the engineered CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the specificity of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the specificity of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the stability of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the stability of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the engineered CRISPR-Cas protein further comprises one or more mutations which inactivate catalytic activity. In some embodiments, the off-target binding of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the off-target binding of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the target binding of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the target binding of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein. In some embodiments, the engineered CRISPR-Cas protein has a higher protease activity or polynucleotide-binding capability compared with a corresponding wildtype CRISPR-Cas protein. In some embodiments, the PFS recognition is altered as compared to a corresponding wildtype CRISPR-Cas protein.

In certain embodiments, the gRNA (crRNA) binding of the Cas13 protein of the invention is altered or modified. It is to be understood that mutated Cas13 has an altered or modified gRNA binding if the gRNA binding is different than the gRNA binding of the corresponding wild type Cas13 (i.e. unmutated Cas13). gRNA binding can be determined by means known in the art. By means of example, and without limitation, gRNA binding can be determined by calculating binding strength or affinity (such as based on equilibrium constants, Ka, Kd, etc). In certain embodiments, gRNA binding is increased. In certain embodiments, gRNA binding is increased by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In certain embodiments, gRNA binding is decreased. In certain embodiments, gRNA binding is decreased by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or (substantially) 100%.

In certain embodiments, the specificity of the Cas13 protein of the invention is altered or modified. It is to be understood that mutated Cas13 has an altered or modified specificity if the specificity is different than the specificity of the corresponding wild type Cas13 (i.e. unmutated Cas13). Specificity can be determined by means known in the art. By means of example, and without limitation, specificity can be determined by comparison of on-target activity and off-target activity. In certain embodiments, specificity is increased. In certain embodiments, specificity is increased by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In certain embodiments, specificity is decreased. In certain embodiments, specificity is decreased by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or (substantially) 100%.
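One possible way to compare on-target and off-target activity is sketched below in Python. The metric used here, the fraction of total measured activity that is on-target, is an assumption chosen for illustration; the disclosure does not prescribe a particular metric, and other measures (for example a simple on-target to off-target ratio) could be used instead. The function name and all values are hypothetical.

```python
def specificity_fraction(on_target: float, off_target: float) -> float:
    """Fraction of total measured activity that is on-target (0.0 to 1.0)."""
    if on_target < 0 or off_target < 0:
        raise ValueError("activities must be non-negative")
    total = on_target + off_target
    if total == 0:
        raise ValueError("no activity measured")
    return on_target / total


# Hypothetical activity readouts in arbitrary units.
wild_type = specificity_fraction(on_target=80.0, off_target=20.0)
mutant = specificity_fraction(on_target=72.0, off_target=8.0)

# In this toy example the mutant loses some on-target activity but
# loses proportionally more off-target activity.
print(f"wild type: {wild_type:.2f}, mutant: {mutant:.2f}")
print("specificity increased" if mutant > wild_type else "specificity not increased")
```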

In certain embodiments, the stability of the Cas13 protein of the invention is altered or modified. It is to be understood that mutated Cas13 has an altered or modified stability if the stability is different than the stability of the corresponding wild type Cas13 (i.e. unmutated Cas13). Stability can be determined by means known in the art. By means of example, and without limitation, stability can be determined by determining the half-life of the Cas13 protein. In certain embodiments, stability is increased. In certain embodiments, stability is increased by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In certain embodiments, stability is decreased. In certain embodiments, stability is decreased by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or (substantially) 100%.
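As a minimal sketch of a half-life determination, the following Python example estimates a half-life from two protein abundance measurements. It assumes simple first-order (exponential) decay, which is an assumption made here for illustration and may not hold for a given protein or assay; the function name and the values are hypothetical.

```python
import math


def half_life(initial: float, remaining: float, elapsed: float) -> float:
    """Half-life under assumed first-order decay.

    initial and remaining are protein abundances (same units) at time 0
    and after `elapsed` time units; the result is in the same time units.
    """
    if not (0 < remaining < initial) or elapsed <= 0:
        raise ValueError("require 0 < remaining < initial and elapsed > 0")
    rate = math.log(initial / remaining) / elapsed  # decay constant k
    return math.log(2) / rate                       # t_half = ln(2) / k


# Hypothetical abundances: 100 units falling to 25 units after 8 hours,
# i.e. two half-lives have elapsed.
print(round(half_life(100.0, 25.0, 8.0), 3))
```

Half-lives obtained in this way for a mutated and a wild type protein can then be compared as a percentage change.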

In certain embodiments, the target binding of the Cas13 protein of the invention is altered or modified. It is to be understood that mutated Cas13 has an altered or modified target binding if the target binding is different than the target binding of the corresponding wild type Cas13 (i.e. unmutated Cas13). Target binding can be determined by means known in the art. By means of example, and without limitation, target binding can be determined by calculating binding strength or affinity (such as based on equilibrium constants, Ka, Kd, etc). In certain embodiments, target binding is increased. In certain embodiments, target binding is increased by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In certain embodiments, target binding is decreased. In certain embodiments, target binding is decreased by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or (substantially) 100%.

In certain embodiments, the off-target binding of the Cas13 protein of the invention is altered or modified. It is to be understood that mutated Cas13 has an altered or modified off-target binding if the off-target binding is different than the off-target binding of the corresponding wild type Cas13 (i.e. unmutated Cas13). Off-target binding can be determined by means known in the art. By means of example, and without limitation, off-target binding can be determined by calculating binding strength or affinity (such as based on equilibrium constants, Ka, Kd, etc). In certain embodiments, off-target binding is increased. In certain embodiments, off-target binding is increased by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In certain embodiments, off-target binding is decreased. In certain embodiments, off-target binding is decreased by at least 5%, preferably at least 10%, more preferably at least 20%, such as at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or (substantially) 100%.

In certain embodiments, the PFS (or PAM) recognition or specificity of the Cas13 protein of the invention is altered or modified. It is to be understood that mutated Cas13 has an altered or modified PFS recognition or specificity if the PFS recognition or specificity is different than the PFS recognition or specificity of the corresponding wild type Cas13 (i.e. unmutated Cas13). PFS recognition or specificity can be determined by means known in the art. By means of example, and without limitation, PFS recognition or specificity can be determined by PFS (PAM) screens. In certain embodiments, at least one different PFS is recognized by the Cas13. In certain embodiments, at least one PFS is recognized by the mutated Cas13 which is not recognized by the corresponding wild type Cas13. In certain embodiments, at least one PFS is recognized by the mutated Cas13 which is not recognized by the corresponding wild type Cas13, in addition to the wild type PFS. In certain embodiments, at least one PFS is recognized by the mutated Cas13 which is not recognized by the corresponding wild type Cas13, and the wild type PFS is no longer recognized. In certain embodiments, the PFS recognized by the mutated Cas13 is longer than the PFS recognized by the wild type Cas13, such as 1, 2, or 3 nucleotides longer. In certain embodiments, the PFS recognized by the mutated Cas13 is shorter than the PFS recognized by the wild type Cas13, such as 1, 2, or 3 nucleotides shorter.
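The alternatives described above (a PFS gained in addition to the wild type PFS, or a PFS gained while the wild type PFS is lost) can be illustrated by comparing the sets of PFS sequences recovered for each protein, for example from a PFS screen. The following Python sketch is illustrative only; the function name and the PFS sequences are hypothetical and are not results of any screen.

```python
from typing import Dict, List, Set


def compare_pfs(wild_type: Set[str], mutant: Set[str]) -> Dict[str, List[str]]:
    """Classify PFS sequences as gained, lost, or retained by the mutant."""
    return {
        "gained": sorted(mutant - wild_type),    # recognized only by the mutant
        "lost": sorted(wild_type - mutant),      # no longer recognized
        "retained": sorted(wild_type & mutant),  # recognized by both
    }


# Hypothetical PFS sets for illustration.
wt_pfs = {"A", "C", "U"}
mutant_pfs = {"A", "G"}

result = compare_pfs(wt_pfs, mutant_pfs)
print(result)
```

An empty "lost" list corresponds to a mutant that recognizes a new PFS in addition to the wild type PFS, whereas an empty "retained" list corresponds to a mutant for which the wild type PFS is no longer recognized.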

The invention provides a non-naturally occurring or engineered composition comprising

i) a mutated Cas13 effector protein, and

ii) a crRNA,

wherein the crRNA comprises a) a guide sequence that is capable of hybridizing to a target RNA sequence, and b) a direct repeat sequence,

whereby there is formed a CRISPR complex comprising the Cas13 effector protein complexed with the guide sequence that is hybridized to the target RNA sequence. The complex can be formed in vitro or ex vivo and introduced into a cell or contacted with RNA; or can be formed in vivo.

In some embodiments, such as for Cas13b, a non-naturally occurring or engineered composition of the invention may comprise an accessory protein that enhances Type VI-B CRISPR-Cas effector protein activity.

In certain such embodiments, the accessory protein that enhances Cas13b effector protein activity is a csx28 protein. In such embodiments, the Type VI-B CRISPR-Cas effector protein and the Type VI-B CRISPR-Cas accessory protein may be from the same source or from a different source.

In some embodiments, a non-naturally occurring or engineered composition of the invention comprises an accessory protein that represses Cas13b effector protein activity.

In certain such embodiments, the accessory protein that represses Cas13b effector protein activity is a csx27 protein. In such embodiments, the Type VI-B CRISPR-Cas effector protein and the Type VI-B CRISPR-Cas accessory protein may be from the same source or from a different source. In certain embodiments of the invention, the Type VI-B CRISPR-Cas effector protein is from Table 1.

In some embodiments, a non-naturally occurring or engineered composition of the invention comprises two or more crRNAs.

In some embodiments, a non-naturally occurring or engineered composition of the invention comprises a guide sequence that hybridizes to a target RNA sequence in a prokaryotic cell.

In some embodiments, a non-naturally occurring or engineered composition of the invention comprises a guide sequence that hybridizes to a target RNA sequence in a eukaryotic cell.

In some embodiments, the Cas13 effector protein comprises one or more nuclear localization signals (NLSs).

In certain embodiments, the Cas13 effector protein of the invention is, or in, or comprises, or consists essentially of, or consists of, or involves or relates to such a protein derived from or as set forth in Tables 1-4, and comprising one or more mutations of the invention as described herein elsewhere.

In some embodiments of the non-naturally occurring or engineered composition of the invention, the Cas13 effector protein is associated with one or more functional domains. The association can be by direct linkage of the effector protein to the functional domain, or by association with the crRNA. In a non-limiting example, the crRNA comprises an added or inserted sequence that can be associated with a functional domain of interest, including, for example, an aptamer or a nucleotide that binds to a nucleic acid binding adapter protein. The functional domain may be a functional heterologous domain.

In certain non-limiting embodiments, a non-naturally occurring or engineered composition of the invention comprises a functional domain that cleaves the target RNA sequence.

In certain non-limiting embodiments, the non-naturally occurring or engineered composition of the invention comprises a functional domain that modifies transcription or translation of the target RNA sequence.

In some embodiments of the composition of the invention, the Cas13 effector protein is associated with one or more functional domains; and the effector protein contains one or more mutations within an HEPN domain, whereby the complex can deliver an epigenetic modifier or a transcriptional or translational activation or repression signal. The complex can be formed in vitro or ex vivo and introduced into a cell or contacted with RNA; or can be formed in vivo.

In some embodiments of the non-naturally occurring or engineered composition of the invention, the Cas13b effector protein and the accessory protein are from the same organism.

In some embodiments of the non-naturally occurring or engineered composition of the invention, the Cas13b effector protein and the accessory protein are from different organisms.

The invention also provides a Type VI CRISPR-Cas vector system, which comprises one or more vectors comprising:

a first regulatory element operably linked to a nucleotide sequence encoding the Cas13 effector protein, and a second regulatory element operably linked to a nucleotide sequence encoding the crRNA.

In certain embodiments, the vector system of the invention further comprises a regulatory element operably linked to a nucleotide sequence of a Type VI-B CRISPR-Cas accessory protein.

When appropriate, the nucleotide sequence encoding the Type VI CRISPR-Cas effector protein (and/or optionally the nucleotide sequence encoding the Type VI-B CRISPR-Cas accessory protein) is codon optimized for expression in a eukaryotic cell.

In some embodiments of the vector system of the invention, the nucleotide sequences encoding the Cas13 effector protein (and optionally the accessory protein) are codon optimized for expression in a eukaryotic cell.

In some embodiments, the vector system of the invention comprises a single vector.

In some embodiments of the vector system of the invention, the one or more vectors comprise viral vectors.

In some embodiments of the vector system of the invention, the one or more vectors comprise one or more retroviral, lentiviral, adenoviral, adeno-associated or herpes simplex viral vectors.

The invention provides a delivery system configured to deliver a Cas13 effector protein and one or more nucleic acid components of a non-naturally occurring or engineered composition comprising

i) a mutated Cas13 effector protein according to the invention as described herein, and

ii) a crRNA,

wherein the crRNA comprises a) a guide sequence that hybridizes to a target RNA sequence in a cell, and b) a direct repeat sequence,

wherein the Cas13 effector protein forms a complex with the crRNA,

wherein the guide sequence directs sequence-specific binding to the target RNA sequence,

whereby there is formed a CRISPR complex comprising the Cas13 effector protein complexed with the guide sequence that is hybridized to the target RNA sequence. The complex can be formed in vitro or ex vivo and introduced into a cell or contacted with RNA; or can be formed in vivo.

In some embodiments of the delivery system of the invention, the system comprises one or more vectors or one or more polynucleotide molecules, the one or more vectors or polynucleotide molecules comprising one or more polynucleotide molecules encoding the Cas13 effector protein and one or more nucleic acid components of the non-naturally occurring or engineered composition.

In some embodiments, the delivery system of the invention comprises a delivery vehicle comprising liposome(s), particle(s), exosome(s), microvesicle(s), a gene-gun or one or more viral vector(s).

In some embodiments, the non-naturally occurring or engineered composition of the invention is for use in a therapeutic method of treatment or in a research program.

In some embodiments, the non-naturally occurring or engineered vector system of the invention is for use in a therapeutic method of treatment or in a research program.

In some embodiments, the non-naturally occurring or engineered delivery system of the invention is for use in a therapeutic method of treatment or in a research program.

The invention provides a method of modifying expression of a target gene of interest, the method comprising contacting a target RNA with one or more non-naturally occurring or engineered compositions comprising

i) a mutated Cas13 effector protein according to the invention as described herein, and

ii) a crRNA,

wherein the crRNA comprises a) a guide sequence that hybridizes to a target RNA sequence in a cell, and b) a direct repeat sequence,

wherein the Cas13 effector protein forms a complex with the crRNA,

wherein the guide sequence directs sequence-specific binding to the target RNA sequence in a cell,

whereby there is formed a CRISPR complex comprising the Cas13 effector protein complexed with the guide sequence that is hybridized to the target RNA sequence,

whereby expression of the target locus of interest is modified. The complex can be formed in vitro or ex vivo and introduced into a cell or contacted with RNA; or can be formed in vivo.

In some embodiments, the method of modifying expression of a target gene of interest further comprises contacting the target RNA with an accessory protein that enhances Cas13b effector protein activity.

In some embodiments of the method of modifying expression of a target gene of interest, the accessory protein that enhances Cas13b effector protein activity is a csx28 protein.

In some embodiments, the method of modifying expression of a target gene of interest further comprises contacting the target RNA with an accessory protein that represses Cas13b effector protein activity.

In some embodiments of the method of modifying expression of a target gene of interest, the accessory protein that represses Cas13b effector protein activity is a csx27 protein.

In some embodiments, the method of modifying expression of a target gene of interest comprises cleaving the target RNA.

In some embodiments, the method of modifying expression of a target gene of interest comprises increasing or decreasing expression of the target RNA.

In some embodiments of the method of modifying expression of a target gene of interest, the target gene is in a prokaryotic cell.

In some embodiments of the method of modifying expression of a target gene of interest, the target gene is in a eukaryotic cell.

The invention provides a cell comprising a modified target of interest, wherein the target of interest has been modified according to any of the methods disclosed herein.

In some embodiments of the invention, the cell is a prokaryotic cell.

In some embodiments of the invention, the cell is a eukaryotic cell.

In some embodiments, modification of the target of interest in a cell results in:

a cell comprising altered expression of at least one gene product;

a cell comprising altered expression of at least one gene product, wherein the expression of the at least one gene product is increased; or

a cell comprising altered expression of at least one gene product, wherein the expression of the at least one gene product is decreased.

In some embodiments, the cell is a mammalian cell or a human cell.

The invention provides a cell line of or comprising a cell disclosed herein or a cell modified by any of the methods disclosed herein, or progeny thereof.

The invention provides a multicellular organism comprising one or more cells disclosed herein or one or more cells modified according to any of the methods disclosed herein.

The invention provides a plant or animal model comprising one or more cells disclosed herein or one or more cells modified according to any of the methods disclosed herein.

The invention provides a gene product from a cell or the cell line or the organism or the plant or animal model disclosed herein.

In some embodiments, the amount of gene product expressed is greater than or less than the amount of gene product from a cell that does not have altered expression.

In certain embodiments, the Cas13 protein originates from a species of the genus Alistipes, Anaerosalibacter, Bacteroides, Bacteroidetes, Bergeyella, Blautia, Butyrivibrio, Capnocytophaga, Carnobacterium, Chloroflexus, Chryseobacterium, Clostridium, Demequina, Eubacteriaceae, Eubacterium, Flavobacterium, Fusobacterium, Herbinix, Insolitispirillum, Lachnospiraceae, Leptotrichia, Listeria, Myroides, Paludibacter, Phaeodactylibacter, Porphyromonadaceae, Porphyromonas, Prevotella, Pseudobutyrivibrio, Psychroflexus, Reichenbachiella, Rhodobacter, Riemerella, Sinomicrobium, Thalassospira, or Ruminococcus. As used herein, when a Cas13 protein originates from a species, it may be the wild type Cas13 protein in the species, or a homolog of the wild type Cas13 protein in the species. The Cas13 protein that is a homolog of the wild type Cas13 protein in the species may comprise one or more variations (e.g., mutations, truncations, etc.) of the wild type Cas13 protein.

In certain embodiments, the Cas13 protein originates from Leptotrichiashahii, Listeria seeligeri, Lachnospiraceae bacterium (such as LbMA2020, Lb NK4A179, Lb NK4A144), Clostridium aminophilum (such as Ca DSM10710), Carnobacterium gallinarum (such as Cg DSM 4847), Paludibacterpropionicigenes (such as Pp WB4), Listeria weihenstephanensis (such asLw FSL R9-0317), Listeriaceae bacterium (such as Lb FSL M6-0635),Leptotrichia wadei (such as Lw F0279), Rhodobacter capsulatus (such asRc SB 1003, Rc R121, Rc DE442), Leptotrichia buccalis (such as LbC-1013-b), Herbinix hemicellulosilytica, Eubacteriaceae bacterium (suchas Eb CHKCI004), Blautia. sp Marseille-P2398, Leptotrichia sp. oraltaxon 879 str. F0557, Chloroflexus aggregans, Demequina aurantiaca,Thalassospira sp. TSL5-1, Pseudobutyrivibrio sp. OR37, Butyrivibrio sp.YAB3001, Leptotrichia sp. Marseille-P3007, Bacteroides ihuae,Porphyromonadaceae bacterium (such as Pb KH3CP3RA), Listeria riparia,Insolitispirillum peregrinum, Alistipes sp. ZOR0009, Bacteroidespyogenes (such as Bp F0041), Bacteroidetes bacterium (such as BbGWA2_31_9), Bergeyella zoohelcum (such as Bz ATCC 43767), Capnocytophagacanimorsus, Capnocytophaga cynodegmi, Chryseobacterium carnipullorum,Chryseobacterium jejuense, Chryseobacterium ureilyticum, Flavobacteriumbranchiophilum, Flavobacterium columnare, Flavobacterium sp. 316,Myroides odoratimimus (such as Mo CCUG 10230, Mo CCUG 12901, Mo CCUG3837), Paludibacter propionicigenes, Phaeodactylibacter xiamenensis,Porphyromonas gingivalis (such as Pg F0185, Pg F0568, Pg JCVI SC001, PgW4087, Porphyromonas gulae, Porphyromonas sp. COT-052 OH4946, Prevotellaaurantiaca, Prevotella buccae (such as Pb ATCC 33574), Prevotellafalsenii, Prevotella intermedia (such as Pi 17, Pi ZT), Prevotellapallens (such as Pp ATCC 700821), Prevotella pleuritidis, Prevotellasaccharolytica (such as Ps F0055), Prevotella sp. MA2016, Prevotella sp.MSX73, Prevotella sp. P4-76, Prevotella sp. P5-119, Prevotella sp.P5-125, Prevotella sp. 
P5-60, Psychroflexus torquis, Reichenbachiellaagariperforans, Riemerella anatipestifer, Sinomicrobium oceani,Fusobacterium necrophorum (such as Fn subsp. funduliforme ATCC 51357, FnDJ-2, Fn BFTR-1, Fn subsp. funduliforme), Fusobacterium perfoetens (suchas Fp ATCC 29250), Fusobacterium ulcerans (such as Fu ATCC 49185),Anaerosalibacter sp. ND1, Eubacterium siraeum, Ruminococcus flavefaciens(such as Rfx XPD3002), or Ruminococcus albus.

In certain embodiments, the Cas13 is Cas13a and originates from a species of the genus Bacteroides, Blautia, Butyrivibrio, Carnobacterium, Chloroflexus, Clostridium, Demequina, Eubacterium, Herbinix, Insolitispirillum, Lachnospiraceae, Leptotrichia, Listeria, Paludibacter, Porphyromonadaceae, Pseudobutyrivibrio, Rhodobacter, or Thalassospira.

In certain embodiments, the Cas13 is Cas13a and originates from Leptotrichia shahii, Listeria seeligeri, Lachnospiraceae bacterium (such as Lb MA2020, Lb NK4A179, Lb NK4A144), Clostridium aminophilum (such as Ca DSM 10710), Carnobacterium gallinarum (such as Cg DSM 4847), Paludibacter propionicigenes (such as Pp WB4), Listeria weihenstephanensis (such as Lw FSL R9-0317), Listeriaceae bacterium (such as Lb FSL M6-0635), Leptotrichia wadei (such as Lw F0279), Rhodobacter capsulatus (such as Rc SB 1003, Rc R121, Rc DE442), Leptotrichia buccalis (such as Lb C-1013-b), Herbinix hemicellulosilytica, Eubacteriaceae bacterium (such as Eb CHKCI004), Blautia sp. Marseille-P2398, Leptotrichia sp. oral taxon 879 str. F0557, Chloroflexus aggregans, Demequina aurantiaca, Thalassospira sp. TSL5-1, Pseudobutyrivibrio sp. OR37, Butyrivibrio sp. YAB3001, Leptotrichia sp. Marseille-P3007, Bacteroides ihuae, Porphyromonadaceae bacterium (such as Pb KH3CP3RA), Listeria riparia, or Insolitispirillum peregrinum.

In certain embodiments, the Cas13 is Cas13b and originates from a species of the genus Alistipes, Bacteroides, Bacteroidetes, Bergeyella, Capnocytophaga, Chryseobacterium, Flavobacterium, Myroides, Paludibacter, Phaeodactylibacter, Porphyromonas, Prevotella, Psychroflexus, Reichenbachiella, Riemerella, or Sinomicrobium.

In certain embodiments, the Cas13 is Cas13b and originates from Alistipes sp. ZOR0009, Bacteroides pyogenes (such as Bp F0041), Bacteroidetes bacterium (such as Bb GWA2_31_9), Bergeyella zoohelcum (such as Bz ATCC 43767), Capnocytophaga canimorsus, Capnocytophaga cynodegmi, Chryseobacterium carnipullorum, Chryseobacterium jejuense, Chryseobacterium ureilyticum, Flavobacterium branchiophilum, Flavobacterium columnare, Flavobacterium sp. 316, Myroides odoratimimus (such as Mo CCUG 10230, Mo CCUG 12901, Mo CCUG 3837), Paludibacter propionicigenes, Phaeodactylibacter xiamenensis, Porphyromonas gingivalis (such as Pg F0185, Pg F0568, Pg JCVI SC001, Pg W4087), Porphyromonas gulae, Porphyromonas sp. COT-052 OH4946, Prevotella aurantiaca, Prevotella buccae (such as Pb ATCC 33574), Prevotella falsenii, Prevotella intermedia (such as Pi 17, Pi ZT), Prevotella pallens (such as Pp ATCC 700821), Prevotella pleuritidis, Prevotella saccharolytica (such as Ps F0055), Prevotella sp. MA2016, Prevotella sp. MSX73, Prevotella sp. P4-76, Prevotella sp. P5-119, Prevotella sp. P5-125, Prevotella sp. P5-60, Psychroflexus torquis, Reichenbachiella agariperforans, Riemerella anatipestifer, or Sinomicrobium oceani. In some examples, the Cas13 is Riemerella anatipestifer Cas13b. In some examples, the Cas13 is a dead Riemerella anatipestifer Cas13. In some examples, the Cas13 is Prevotella sp. P5-125. In some examples, the Cas13 is a dead Prevotella sp. P5-125.

In certain embodiments, the Cas13 is Cas13c and originates from a species of the genus Fusobacterium or Anaerosalibacter.

In certain embodiments, the Cas13 is Cas13c and originates from Fusobacterium necrophorum (such as Fn subsp. funduliforme ATCC 51357, Fn DJ-2, Fn BFTR-1, Fn subsp. funduliforme), Fusobacterium perfoetens (such as Fp ATCC 29250), Fusobacterium ulcerans (such as Fu ATCC 49185), or Anaerosalibacter sp. ND1.

In certain embodiments, the Cas13 is Cas13d and originates from a species of the genus Eubacterium or Ruminococcus.

In certain embodiments, the Cas13 is Cas13d and originates from Eubacterium siraeum, Ruminococcus flavefaciens (such as Rfx XPD3002), or Ruminococcus albus.
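By way of illustration only, the genus-level groupings recited above can be held in a simple lookup table. The following Python sketch is not part of the disclosed subject matter; the names `CAS13_GENERA` and `candidate_subtypes` are hypothetical, and the genus is assumed to be the first word of the organism name. It shows that some genera (e.g., Bacteroides, Paludibacter, Eubacterium) are recited for more than one Cas13 subtype, so a genus alone may not identify the subtype.

```python
# Genus lists transcribed from the Cas13a, Cas13b, Cas13c and Cas13d
# embodiments above. The dictionary name is hypothetical.
CAS13_GENERA = {
    "Cas13a": {
        "Bacteroides", "Blautia", "Butyrivibrio", "Carnobacterium",
        "Chloroflexus", "Clostridium", "Demequina", "Eubacterium",
        "Herbinix", "Insolitispirillum", "Lachnospiraceae", "Leptotrichia",
        "Listeria", "Paludibacter", "Porphyromonadaceae",
        "Pseudobutyrivibrio", "Rhodobacter", "Thalassospira",
    },
    "Cas13b": {
        "Alistipes", "Bacteroides", "Bacteroidetes", "Bergeyella",
        "Capnocytophaga", "Chryseobacterium", "Flavobacterium", "Myroides",
        "Paludibacter", "Phaeodactylibacter", "Porphyromonas", "Prevotella",
        "Psychroflexus", "Reichenbachiella", "Riemerella", "Sinomicrobium",
    },
    "Cas13c": {"Fusobacterium", "Anaerosalibacter"},
    "Cas13d": {"Eubacterium", "Ruminococcus"},
}


def candidate_subtypes(organism):
    """Return the sorted Cas13 subtypes whose genus list contains the organism's genus.

    Assumption: the genus is the first whitespace-separated word of the name.
    """
    genus = organism.split()[0].rstrip(".")
    return sorted(
        subtype for subtype, genera in CAS13_GENERA.items() if genus in genera
    )


print(candidate_subtypes("Prevotella sp. P5-125"))
print(candidate_subtypes("Paludibacter propionicigenes"))
```

A lookup of this kind only narrows the candidates; the subtype of a given protein is determined by the protein itself (e.g., as set forth in Tables 1-4), not by the genus name.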

In certain embodiments, the invention provides an isolated Cas13 effector protein, comprising or consisting essentially of or consisting of or as set forth in Tables 1-4, and comprising one or more mutations as described herein elsewhere. A Tables 1-4 Cas13 effector protein is as discussed in more detail herein in conjunction with Tables 1-4. The invention provides an isolated nucleic acid encoding the Cas13 effector protein. In some embodiments of the invention the isolated nucleic acid comprises a DNA sequence and further comprises a sequence encoding a crRNA. The invention provides an isolated eukaryotic cell comprising the nucleic acid encoding the Cas13 effector protein. Thus, herein, “Cas13 effector protein” or “effector protein” or “Cas” or “Cas protein” or “RNA targeting effector protein” or “RNA targeting protein” or like expressions are to be understood as including Cas13a, Cas13b, Cas13c, or Cas13d; expressions such as “RNA targeting CRISPR system” are to be understood as including Cas13a, Cas13b, Cas13c, or Cas13d CRISPR systems, and in certain embodiments can be read as a Tables 1-4 Cas13 effector protein CRISPR system; and references to guide RNA or sgRNA are to be read in conjunction with the herein-discussion of the Cas13 system crRNA, e.g., that which is sgRNA in other systems may be considered as or akin to crRNA in the instant invention.

The invention provides a method of identifying the requirements of a suitable guide sequence for the Cas13 effector protein of the invention (e.g., Tables 1-4), said method comprising:

(a) selecting a set of essential genes within an organism;

(b) designing a library of targeting guide sequences capable of hybridizing to the coding regions of these genes as well as the 5′ and 3′ UTRs of these genes;

(c) generating randomized guide sequences that do not hybridize to any region within the genome of said organism as control guides;

(d) preparing a plasmid comprising the RNA-targeting protein and a first resistance gene and a guide plasmid library comprising said library of targeting guides and said control guides and a second resistance gene;

(e) co-introducing said plasmids into a host cell;

(f) introducing said host cells on a selective medium for said first and second resistance genes;

(g) sequencing essential genes of growing host cells;

(h) determining significance of depletion of cells transformed with targeting guides by comparing depletion of cells with control guides; and

(i) determining based on the depleted guide sequences the requirements of a suitable guide sequence.

In one aspect of such method, determining the PFS sequence for suitable guide sequence of the RNA-targeting protein is by comparison of sequences targeted by guides in depleted cells. In one aspect of such method, the method further comprises comparing the guide abundance for the different conditions in different replicate experiments. In one aspect of such method, the control guides are selected in that they are determined to show limited deviation in guide depletion in replicate experiments. In one aspect of such method, the significance of depletion is determined as (a) a depletion which is more than the most depleted control guide; or (b) a depletion which is more than the average depletion plus two times the standard deviation for the control guides. In one aspect of such method, the host cell is a bacterial host cell. In one aspect of such method, the step of co-introducing the plasmids is by electroporation and the host cell is an electro-competent host cell.
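By way of a non-limiting sketch, the two significance criteria recited above, (a) depletion exceeding the most depleted control guide and (b) depletion exceeding the control average plus two standard deviations, may be expressed as follows. The function name, the guide identifiers and the numerical values are hypothetical. It is assumed that each guide already has a single depletion score for which a larger value means stronger depletion, and that the sample standard deviation is intended; the disclosure does not specify either point.

```python
from statistics import mean, stdev


def significant_guides(targeting, control, rule="max_control"):
    """Return the set of targeting guides whose depletion exceeds a control-derived threshold.

    targeting, control: dicts mapping guide id -> depletion score
    (assumption: a higher score means the guide is more depleted).
    rule: "max_control"   -> criterion (a), most depleted control guide
          "mean_plus_2sd" -> criterion (b), control mean + 2 * standard deviation
    """
    scores = list(control.values())
    if rule == "max_control":
        threshold = max(scores)
    elif rule == "mean_plus_2sd":
        # Sample standard deviation is an assumption of this sketch.
        threshold = mean(scores) + 2 * stdev(scores)
    else:
        raise ValueError(f"unknown rule: {rule!r}")
    return {guide for guide, score in targeting.items() if score > threshold}


# Illustrative values only; these are not experimental data.
control = {"ctrl_1": 0.1, "ctrl_2": 0.2, "ctrl_3": 0.3}
targeting = {"guide_1": 0.25, "guide_2": 0.35, "guide_3": 0.9}

print(significant_guides(targeting, control, rule="max_control"))
print(significant_guides(targeting, control, rule="mean_plus_2sd"))
```

With these illustrative values, criterion (b) is the stricter of the two; with other control distributions the ordering may be reversed.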

The invention provides a method of modifying sequences associated with or at a target locus of interest, the method comprising delivering to said locus a non-naturally occurring or engineered composition comprising a Cas13 effector protein and one or more nucleic acid components, wherein the effector protein forms a complex with the one or more nucleic acid components and upon binding of the said complex to the locus of interest the effector protein induces the modification of the sequences associated with or at the target locus of interest. In a preferred embodiment, the modification is the introduction of a strand break. In a preferred embodiment, the sequences associated with or at the target locus of interest comprise RNA or consist of RNA.

The invention provides a method of modifying sequences associated with or at a target locus of interest, the method comprising delivering to said locus a non-naturally occurring or engineered composition comprising a Cas13 effector protein, optionally a small accessory protein, and one or more nucleic acid components, wherein the effector protein forms a complex with the one or more nucleic acid components and upon binding of the said complex to the locus of interest the effector protein induces the modification of the sequences associated with or at the target locus of interest. In a preferred embodiment, the modification is the introduction of a strand break. In a preferred embodiment, the sequences associated with or at the target locus of interest comprise RNA or consist of RNA.

The invention provides a method of modifying sequences associated with or at a target locus of interest, the method comprising delivering to said sequences associated with or at the locus a non-naturally occurring or engineered composition comprising a Cas13 loci effector protein and one or more nucleic acid components, wherein the Cas13 effector protein forms a complex with the one or more nucleic acid components and upon binding of the said complex to the locus of interest the effector protein induces the modification of sequences associated with or at the target locus of interest. In a preferred embodiment, the modification is the introduction of a strand break. In a preferred embodiment the Cas13 effector protein forms a complex with one nucleic acid component; advantageously an engineered or non-naturally occurring nucleic acid component. The induction of modification of sequences associated with or at the target locus of interest can be Cas13 effector protein-nucleic acid guided. In a preferred embodiment the one nucleic acid component is a CRISPR RNA (crRNA). In a preferred embodiment the one nucleic acid component is a mature crRNA or guide RNA, wherein the mature crRNA or guide RNA comprises a spacer sequence (or guide sequence) and a direct repeat (DR) sequence or derivatives thereof. In a preferred embodiment the spacer sequence or the derivative thereof comprises a seed sequence, wherein the seed sequence is critical for recognition and/or hybridization to the sequence at the target locus. In a preferred embodiment of the invention the crRNA is a short crRNA that may be associated with a short DR sequence. In another embodiment of the invention the crRNA is a long crRNA that may be associated with a long DR sequence (or dual DR). Aspects of the invention relate to Cas13 effector protein complexes having one or more non-naturally occurring or engineered or modified or optimized nucleic acid components. In a preferred embodiment the nucleic acid component comprises RNA. In a preferred embodiment the nucleic acid component of the complex may comprise a guide sequence linked to a direct repeat sequence, wherein the direct repeat sequence comprises one or more stem loops or optimized secondary structures. In preferred embodiments of the invention, the direct repeat may be a short DR or a long DR (dual DR). In a preferred embodiment the direct repeat may be modified to comprise one or more protein-binding RNA aptamers. In a preferred embodiment, one or more aptamers may be included such as part of optimized secondary structure. Such aptamers may be capable of binding a bacteriophage coat protein. The bacteriophage coat protein may be selected from the group comprising Qβ, F2, GA, fr, JP501, MS2, M12, R17, BZ13, JP34, JP500, KU1, M11, MX1, TW18, VK, SP, FI, ID2, NL95, TW19, AP205, ϕCb5, ϕCb8r, ϕCb12r, ϕCb23r, 7s and PRR1. In a preferred embodiment the bacteriophage coat protein is MS2. The invention also provides for the nucleic acid component of the complex being 30 or more, 40 or more or 50 or more nucleotides in length.
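As a purely illustrative sketch, a nucleic acid component comprising a guide (spacer) sequence linked to a direct repeat, and the length thresholds recited above (30 or more, 40 or more or 50 or more nucleotides), may be modeled as follows. The function names and the toy sequences are hypothetical and are not taken from the sequence listing. Whether the direct repeat lies 5′ or 3′ of the spacer is not stated in this passage, so it is exposed as a parameter rather than fixed.

```python
RNA_ALPHABET = set("ACGU")


def assemble_crrna(spacer, direct_repeat, dr_position="3prime"):
    """Link a spacer (guide) sequence to a direct repeat to form a crRNA string.

    dr_position: "3prime" places the direct repeat after the spacer,
    "5prime" places it before. The orientation is an assumption of the caller.
    """
    for name, seq in (("spacer", spacer), ("direct_repeat", direct_repeat)):
        if not seq or set(seq) - RNA_ALPHABET:
            raise ValueError(f"{name} must be a non-empty A/C/G/U string")
    if dr_position == "3prime":
        return spacer + direct_repeat
    if dr_position == "5prime":
        return direct_repeat + spacer
    raise ValueError("dr_position must be '3prime' or '5prime'")


def meets_min_length(crrna, minimum=30):
    """True when the nucleic acid component is `minimum` or more nucleotides long."""
    return len(crrna) >= minimum


# Toy sequences for illustration only (28 nt spacer, 36 nt repeat).
spacer = "ACGU" * 7
direct_repeat = "GUUG" * 9
crrna = assemble_crrna(spacer, direct_repeat)

print(len(crrna), [meets_min_length(crrna, n) for n in (30, 40, 50)])
```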

The invention provides methods of genome editing or modifying sequences associated with or at a target locus of interest wherein the method comprises introducing a Cas13 complex into any desired cell type, prokaryotic or eukaryotic cell, whereby the Cas13 effector protein complex effectively functions to interfere with RNA in the eukaryotic or prokaryotic cell. In preferred embodiments, the cell is a eukaryotic cell and the RNA is transcribed from a mammalian genome or is present in a mammalian cell. In preferred methods of RNA editing or genome editing in human cells, the Cas13 effector proteins may include but are not limited to the specific species of Cas13 effector proteins disclosed herein.

The invention also provides a method of modifying a target locus of interest, the method comprising delivering to said locus a non-naturally occurring or engineered composition comprising a Cas13 effector protein and one or more nucleic acid components, wherein the Cas13 effector protein forms a complex with the one or more nucleic acid components and upon binding of the said complex to the locus of interest the effector protein induces the modification of the target locus of interest. In a preferred embodiment, the modification is the introduction of a strand break.

In such methods the target locus of interest may be comprised within an RNA molecule. In such methods the target locus of interest may be comprised in an RNA molecule in vitro.

In such methods the target locus of interest may be comprised in an RNA molecule within a cell. The cell may be a prokaryotic cell or a eukaryotic cell. The cell may be a mammalian cell. The modification introduced to the cell by the present invention may be such that the cell and progeny of the cell are altered for improved production of biologic products such as an antibody, starch, alcohol or other desired cellular output. The modification introduced to the cell by the present invention may be such that the cell and progeny of the cell include an alteration that changes the biologic product produced.

The mammalian cell may be a non-human mammalian cell, e.g., a primate, bovine, ovine, porcine, canine, rodent, or Leporidae cell, such as a monkey, cow, sheep, pig, dog, rabbit, rat or mouse cell. The cell may be a non-mammalian eukaryotic cell such as a poultry bird (e.g., chicken), vertebrate fish (e.g., salmon) or shellfish (e.g., oyster, clam, lobster, shrimp) cell. The cell may also be a plant cell. The plant cell may be of a monocot or dicot or of a crop or grain plant such as cassava, corn, sorghum, soybean, wheat, oat or rice. The plant cell may also be of an algae, tree or production plant, fruit or vegetable (e.g., trees such as citrus trees, e.g., orange, grapefruit or lemon trees; peach or nectarine trees; apple or pear trees; nut trees such as almond or walnut or pistachio trees; nightshade plants; plants of the genus Brassica; plants of the genus Lactuca; plants of the genus Spinacia; plants of the genus Capsicum; cotton, tobacco, asparagus, carrot, cabbage, broccoli, cauliflower, tomato, eggplant, pepper, lettuce, spinach, strawberry, blueberry, raspberry, blackberry, grape, coffee, cocoa, etc.).

The invention provides a method of modifying a target locus of interest, the method comprising delivering to said locus a non-naturally occurring or engineered composition comprising a Cas13 effector protein and one or more nucleic acid components, wherein the effector protein forms a complex with the one or more nucleic acid components and upon binding of the said complex to the locus of interest the effector protein induces the modification of the target locus of interest. In a preferred embodiment, the modification is the introduction of a strand break.

In such methods the target locus of interest may be comprised within an RNA molecule. In a preferred embodiment, the target locus of interest comprises or consists of RNA.

The invention also provides a method of modifying a target locus of interest, the method comprising delivering to said locus a non-naturally occurring or engineered composition comprising a Cas13 effector protein and one or more nucleic acid components, wherein the Cas13 effector protein forms a complex with the one or more nucleic acid components and upon binding of the said complex to the locus of interest the effector protein induces the modification of the target locus of interest. In a preferred embodiment, the modification is the introduction of a strand break.

Preferably, in such methods the target locus of interest may be comprised in an RNA molecule in vitro. Also preferably, in such methods the target locus of interest may be comprised in an RNA molecule within a cell. The cell may be a prokaryotic cell or a eukaryotic cell. The cell may be a mammalian cell. The cell may be a rodent cell. The cell may be a mouse cell.

In any of the described methods the target locus of interest may be a genomic or epigenomic locus of interest. In any of the described methods the complex may be delivered with multiple guides for multiplexed use. In any of the described methods more than one protein may be used.

In further aspects of the invention the nucleic acid components may comprise a CRISPR RNA (crRNA) sequence. As the effector protein is a Cas13 effector protein, the nucleic acid components may comprise a CRISPR RNA (crRNA) sequence and generally may not comprise any trans-activating crRNA (tracrRNA) sequence.

In any of the described methods the effector protein and nucleic acid components may be provided via one or more polynucleotide molecules encoding the protein and/or nucleic acid component(s), wherein the one or more polynucleotide molecules are operably configured to express the protein and/or the nucleic acid component(s). The one or more polynucleotide molecules may comprise one or more regulatory elements operably configured to express the protein and/or the nucleic acid component(s). The one or more polynucleotide molecules may be comprised within one or more vectors. In any of the described methods the target locus of interest may be a genomic, epigenomic, or transcriptomic locus of interest. In any of the described methods the complex may be delivered with multiple guides for multiplexed use. In any of the described methods more than one protein may be used.

In any of the described methods the strand break may be a single strand break or a double strand break. In preferred embodiments the double strand break may refer to the breakage of two sections of RNA, such as the two sections of RNA formed when a single-stranded RNA molecule has folded onto itself, or the putative double helices formed when an RNA molecule containing self-complementary sequences folds and pairs with itself.

Regulatory elements may comprise inducible promoters. Polynucleotides and/or vector systems may comprise inducible systems.

In any of the described methods the one or more polynucleotide molecules may be comprised in a delivery system, or the one or more vectors may be comprised in a delivery system.

In any of the described methods the non-naturally occurring or engineered composition may be delivered via liposomes, particles including nanoparticles, exosomes, microvesicles, a gene-gun or one or more viral vectors.

The invention also provides a non-naturally occurring or engineered composition which is a composition having the characteristics as discussed herein or defined in any of the herein described methods.

In certain embodiments, the invention thus provides a non-naturally occurring or engineered composition, such as particularly a composition capable of or configured to modify a target locus of interest, said composition comprising a Cas13 effector protein and one or more nucleic acid components, wherein the effector protein forms a complex with the one or more nucleic acid components and upon binding of the said complex to the locus of interest the effector protein induces the modification of the target locus of interest. In certain embodiments, the effector protein may be a Cas13a, Cas13b, Cas13c, or Cas13d effector protein, preferably a Cas13b effector protein.

The invention also provides in a further aspect a non-naturally occurring or engineered composition, such as particularly a composition capable of or configured to modify a target locus of interest, said composition comprising: (a) a guide RNA molecule (or a combination of guide RNA molecules, e.g., a first guide RNA molecule and a second guide RNA molecule) or a nucleic acid encoding the guide RNA molecule (or one or more nucleic acids encoding the combination of guide RNA molecules); and (b) a Cas13 effector protein. In certain embodiments, the effector protein may be a Cas13b effector protein.

The invention also provides in a further aspect a non-naturally occurring or engineered composition comprising: (I.) one or more CRISPR-Cas system polynucleotide sequences comprising (a) a guide sequence capable of hybridizing to a target sequence in a polynucleotide locus, and (b) a tracr mate (i.e. direct repeat) sequence, and (II.) a second polynucleotide sequence encoding a Cas13 effector protein, wherein when transcribed, the guide sequence directs sequence-specific binding of a CRISPR complex to the target sequence, and wherein the CRISPR complex comprises the Cas13 effector protein complexed with the guide sequence that is hybridized to the target sequence. In certain embodiments, the effector protein may be a Cas13b effector protein.

In certain embodiments, a tracrRNA may not be required. Hence, the invention also provides in certain embodiments a non-naturally occurring or engineered composition comprising: (I.) one or more CRISPR-Cas system polynucleotide sequences comprising (a) a guide sequence capable of hybridizing to a target sequence in a polynucleotide locus, and (b) a direct repeat sequence, and (II.) a second polynucleotide sequence encoding a Cas13 effector protein, wherein when transcribed, the guide sequence directs sequence-specific binding of a CRISPR complex to the target sequence, and wherein the CRISPR complex comprises the Cas13 effector protein complexed with (1) the guide sequence that is hybridized to the target sequence, and (2) the direct repeat sequence. Preferably, the effector protein may be a Cas13b effector protein. Without limitation, the Applicants hypothesize that in such instances, the direct repeat sequence may comprise secondary structure that is sufficient for crRNA loading onto the effector protein. By means of example and not limitation, such secondary structure may comprise, consist essentially of or consist of a stem loop (such as one or more stem loops) within the direct repeat.

The invention also provides a vector system comprising one or more vectors, the one or more vectors comprising one or more polynucleotide molecules encoding components of a non-naturally occurring or engineered composition which is a composition having the characteristics as defined in any of the herein described methods.

The invention also provides a delivery system comprising one or more vectors or one or more polynucleotide molecules, the one or more vectors or polynucleotide molecules comprising one or more polynucleotide molecules encoding components of a non-naturally occurring or engineered composition which is a composition having the characteristics discussed herein or as defined in any of the herein described methods.

The invention also provides a non-naturally occurring or engineered composition, or one or more polynucleotides encoding components of said composition, or vector or delivery systems comprising one or more polynucleotides encoding components of said composition for use in a therapeutic method of treatment. The therapeutic method of treatment may comprise gene or genome editing, or gene therapy.

The invention also provides for methods and compositions wherein one or more amino acid residues of the effector protein may be modified, e.g., an engineered or non-naturally-occurring Cas13 effector protein of or comprising or consisting of or consisting essentially of a Tables 1-4 protein. In an embodiment, the modification may comprise mutation of one or more amino acid residues of the effector protein. The one or more mutations may be in one or more catalytically active domains of the effector protein. The effector protein may have reduced or abolished nuclease activity compared with an effector protein lacking said one or more mutations. The effector protein may not direct cleavage of one RNA strand at the target locus of interest. In a preferred embodiment, the one or more mutations may comprise two mutations. In a preferred embodiment the one or more amino acid residues are modified in the Cas13 effector protein, e.g., an engineered or non-naturally-occurring Cas13 effector protein. In certain embodiments of the invention the effector protein comprises one or more HEPN domains. In a preferred embodiment, the effector protein comprises two HEPN domains. In another preferred embodiment, the effector protein comprises one HEPN domain at the C-terminus and another HEPN domain at the N-terminus of the protein. In certain embodiments, the one or more mutations or the two or more mutations may be in a catalytically active domain of the effector protein comprising a HEPN domain, or a catalytically active domain which is homologous to a HEPN domain. In certain embodiments, the effector protein comprises one or more of the following mutations: R116A, H121A, R1177A, H1182A (wherein amino acid positions correspond to amino acid positions of Group 29 protein originating from Bergeyella zoohelcum ATCC 43767). The skilled person will understand that corresponding amino acid positions in different Cas13 proteins may be mutated to the same effect. In certain embodiments, one or more mutations abolish catalytic activity of the protein completely or partially (e.g. altered cleavage rate, altered specificity, etc.). In certain embodiments, the effector protein as described herein is a “dead” effector protein, such as a dead Cas13 effector protein (i.e. dCas13b). In certain embodiments, the effector protein has one or more mutations in HEPN domain 1. In certain embodiments, the effector protein has one or more mutations in HEPN domain 2. In certain embodiments, the effector protein has one or more mutations in HEPN domain 1 and HEPN domain 2. The effector protein may comprise one or more heterologous functional domains. The one or more heterologous functional domains may comprise one or more nuclear localization signal (NLS) domains. The one or more heterologous functional domains may comprise at least two or more NLS domains. The one or more NLS domain(s) may be positioned at or near or in proximity to a terminus of the effector protein (e.g., Cas13b effector protein) and if two or more NLSs, each of the two may be positioned at or near or in proximity to a terminus of the effector protein (e.g., Cas13 effector protein). The one or more heterologous functional domains may comprise one or more transcriptional activation domains. In a preferred embodiment the transcriptional activation domain may comprise VP64. The one or more heterologous functional domains may comprise one or more transcriptional repression domains. In a preferred embodiment the transcriptional repression domain comprises a KRAB domain or a SID domain (e.g. SID4X). The one or more heterologous functional domains may comprise one or more nuclease domains. In a preferred embodiment a nuclease domain comprises FokI.
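For illustration only, the mutation notation used above (e.g., R116A, meaning arginine at position 116 replaced by alanine) can be parsed and applied to a protein sequence as in the following sketch. The helper names are hypothetical, and the five-residue sequence is a toy example rather than a Cas13 sequence; applying the recited mutations to an actual protein would require the corresponding reference sequence and, for other orthologues, an alignment to identify the corresponding positions.

```python
import re

# Mutations recited above, numbered per the Bergeyella zoohelcum ATCC 43767 protein.
HEPN_MUTATIONS = ["R116A", "H121A", "R1177A", "H1182A"]

_MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(mutation):
    """Split a mutation such as 'R116A' into (wild-type residue, 1-based position, new residue)."""
    match = _MUTATION_PATTERN.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutations(sequence, mutations):
    """Return a copy of `sequence` with each point mutation applied.

    Raises ValueError when a position is out of range or the residue found
    there differs from the wild-type residue named in the mutation.
    """
    residues = list(sequence)
    for mutation in mutations:
        wild_type, position, replacement = parse_mutation(mutation)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{mutation}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{mutation}: found {residues[position - 1]} at position {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Toy sequence, not a Cas13 protein.
print(apply_mutations("MKRHL", ["R3A", "H4A"]))
print([parse_mutation(m) for m in HEPN_MUTATIONS])
```

Checking the wild-type residue before substituting guards against applying a numbering scheme to the wrong orthologue.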

The invention also provides for the one or more heterologous functional domains to have one or more of the following activities: methylase activity, demethylase activity, transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, nuclease activity, single-strand RNA cleavage activity, double-strand RNA cleavage activity, single-strand DNA cleavage activity, double-strand DNA cleavage activity and nucleic acid binding activity. At least one or more heterologous functional domains may be at or near the amino-terminus of the effector protein and/or at least one or more heterologous functional domains may be at or near the carboxy-terminus of the effector protein. The one or more heterologous functional domains may be fused to the effector protein. The one or more heterologous functional domains may be tethered to the effector protein. The one or more heterologous functional domains may be linked to the effector protein by a linker moiety.

In certain embodiments, the Cas13 effector proteins as intended herein may be associated with a locus comprising short CRISPR repeats between 30 and 40 bp long, more typically between 34 and 38 bp long, even more typically between 36 and 37 bp long, e.g., 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 bp long. In certain embodiments the CRISPR repeats are long or dual repeats between 80 and 350 bp long such as between 80 and 200 bp long, even more typically between 86 and 88 bp long, e.g., 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, or 90 bp long.
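The repeat length ranges recited above may be summarized, by way of a minimal and non-limiting sketch, as a simple classification. The function name and the label strings are hypothetical, and the ranges are treated as inclusive, which is an assumption.

```python
def classify_repeat_length(length_bp):
    """Classify a CRISPR repeat by length using the ranges recited above.

    Short repeats: 30 to 40 bp; long or dual repeats: 80 to 350 bp.
    Bounds are treated as inclusive (an assumption of this sketch).
    """
    if 30 <= length_bp <= 40:
        return "short"
    if 80 <= length_bp <= 350:
        return "long or dual"
    return "unclassified"


for length in (36, 87, 60):
    print(length, classify_repeat_length(length))
```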

In certain embodiments, a protospacer adjacent motif (PAM) or PAM-like motif directs binding of the effector protein (e.g. a Cas13 effector protein) complex as disclosed herein to the target locus of interest. In some embodiments, the PAM may be a 5′ PAM (i.e., located upstream of the 5′ end of the protospacer). In other embodiments, the PAM may be a 3′ PAM (i.e., located downstream of the 3′ end of the protospacer). In other embodiments, both a 5′ PAM and a 3′ PAM are required. In certain embodiments of the invention, a PAM or PAM-like motif may not be required for directing binding of the effector protein (e.g. a Cas13 effector protein). In certain embodiments, a 5′ PAM is D (e.g., A, G, or U). In certain embodiments, a 5′ PAM is D for Cas13b effectors. In certain embodiments of the invention, cleavage at repeat sequences may generate crRNAs (e.g. short or long crRNAs) containing a full spacer sequence flanked by a short nucleotide (e.g. 5, 6, 7, 8, 9, or 10 nt or longer if it is a dual repeat) repeat sequence at the 5′ end (this may be referred to as a crRNA “tag”) and the rest of the repeat at the 3′ end. In certain embodiments, targeting by the effector proteins described herein may require the lack of homology between the crRNA tag and the target 5′ flanking sequence. This requirement may be similar to that described further in Samai et al. “Co-transcriptional DNA and RNA Cleavage during Type III CRISPR-Cas Immunity” Cell 161, 1164-1174, May 21, 2015, where the requirement is thought to distinguish between bona fide targets on invading nucleic acids from the CRISPR array itself, and where the presence of repeat sequences will lead to full homology with the crRNA tag and prevent autoimmunity.
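As a hedged illustration of the 5′ motif described above, where D denotes A, G, or U, the following sketch tests the nucleotide immediately upstream of a protospacer in a target RNA. The function name, the 0-based indexing convention and the toy target sequence are assumptions of this sketch; the crRNA tag homology requirement is not modeled here.

```python
# IUPAC code D: any nucleotide except C (A, G, or U in RNA).
IUPAC_D = set("AGU")


def has_5prime_d(target_rna, protospacer_start):
    """True when the base immediately 5' of the protospacer is A, G, or U.

    protospacer_start is the 0-based index of the first protospacer base
    within target_rna. A protospacer at index 0 has no 5' flanking base,
    so the function returns False in that case.
    """
    if not 0 <= protospacer_start < len(target_rna):
        raise ValueError("protospacer_start is outside the target sequence")
    if protospacer_start == 0:
        return False
    return target_rna[protospacer_start - 1] in IUPAC_D


# Toy target sequence for illustration only.
target = "CCAUGGCAUCG"
print(has_5prime_d(target, 3))  # base at index 2 is A
print(has_5prime_d(target, 2))  # base at index 1 is C
```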

In certain embodiments, the Cas13 effector protein is engineered and can comprise one or more mutations that reduce or eliminate nuclease activity, thereby reducing or eliminating RNA interfering activity. Mutations can also be made at neighboring residues, e.g., at amino acids near those that participate in the nuclease activity. In some embodiments, one or more putative catalytic nuclease domains are inactivated and the effector protein complex lacks cleavage activity and functions as an RNA binding complex. In a preferred embodiment, the resulting RNA binding complex may be linked with one or more functional domains as described herein.

In certain embodiments, the one or more functional domains are controllable, i.e. inducible.

In certain embodiments of the invention, the guide RNA or mature crRNA comprises, consists essentially of, or consists of a direct repeat sequence and a guide sequence or spacer sequence. In certain embodiments, the guide RNA or mature crRNA comprises, consists essentially of, or consists of a direct repeat sequence linked to a guide sequence or spacer sequence. In preferred embodiments of the invention, the mature crRNA comprises a stem loop or an optimized stem loop structure or an optimized secondary structure. In preferred embodiments the mature crRNA comprises a stem loop or an optimized stem loop structure in the direct repeat sequence, wherein the stem loop or optimized stem loop structure is important for cleavage activity. In certain embodiments, the mature crRNA preferably comprises a single stem loop. In certain embodiments, the direct repeat sequence preferably comprises a single stem loop. In certain embodiments, the cleavage activity of the effector protein complex is modified by introducing mutations that affect the stem loop RNA duplex structure. In preferred embodiments, mutations which maintain the RNA duplex of the stem loop may be introduced, whereby the cleavage activity of the effector protein complex is maintained. In other preferred embodiments, mutations which disrupt the RNA duplex structure of the stem loop may be introduced, whereby the cleavage activity of the effector protein complex is completely abolished.
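By way of example and not limitation, whether a mutation maintains or disrupts the RNA duplex of a stem loop can be checked at the level of base pairing, as sketched below. The function name and the four-nucleotide stem arms are hypothetical toy inputs. The sketch tests only antiparallel Watson-Crick pairing (optionally with G-U wobble pairs) between the two arms of a stem; it does not predict folding energetics or cleavage activity.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def stem_is_paired(five_prime_arm, three_prime_arm, allow_wobble=False):
    """True when the two arms of a stem form a fully paired antiparallel duplex.

    Both arms are written 5' to 3'; the first base of the 5' arm pairs
    with the last base of the 3' arm, and so on.
    """
    if len(five_prime_arm) != len(three_prime_arm):
        return False
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    return all(
        (a, b) in allowed
        for a, b in zip(five_prime_arm, reversed(three_prime_arm))
    )


# Toy stem arms for illustration only.
print(stem_is_paired("GGAC", "GUCC"))  # original duplex
print(stem_is_paired("GAAC", "GUCC"))  # single mutation disrupts one pair
print(stem_is_paired("GAAC", "GUUC"))  # compensatory mutation restores pairing
```

In this toy case, a single substitution in one arm breaks a pair, while a matching substitution in the opposite arm restores it, corresponding to the duplex-disrupting and duplex-maintaining mutations discussed above.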

The CRISPR system as provided herein can make use of a crRNA or analogous polynucleotide comprising a guide sequence, wherein the polynucleotide is an RNA, a DNA or a mixture of RNA and DNA, and/or wherein the polynucleotide comprises one or more nucleotide analogs. The sequence can comprise any structure, including but not limited to a structure of a native crRNA, such as a bulge, a hairpin or a stem loop structure. In certain embodiments, the polynucleotide comprising the guide sequence forms a duplex with a second polynucleotide sequence which can be an RNA or a DNA sequence.

The present disclosure also provides cells, tissues, and organisms comprising the engineered CRISPR-Cas protein, the CRISPR-Cas systems, the polynucleotides encoding one or more components of the CRISPR-Cas systems, and/or vectors comprising the polynucleotides. The invention also provides for the nucleotide sequence encoding the effector protein being codon optimized for expression in a eukaryote or eukaryotic cell in any of the herein described methods or compositions. In an embodiment of the invention, the codon optimized effector protein is any Cas13 effector protein discussed herein and is codon optimized for operability in a eukaryotic cell or organism, e.g., such cell or organism as elsewhere herein mentioned, for instance, without limitation, a yeast cell, or a mammalian cell or organism, including a mouse cell, a rat cell, and a human cell or non-human eukaryote organism, e.g., plant.

In certain embodiments of the invention, at least one nuclear localization signal (NLS) is attached to the nucleic acid sequences encoding the Cas13 effector proteins. In preferred embodiments at least one or more C-terminal or N-terminal NLSs are attached (and hence nucleic acid molecule(s) coding for the Cas13 effector protein can include coding for NLS(s) so that the expressed product has the NLS(s) attached or connected). In a preferred embodiment a C-terminal NLS is attached for optimal expression and nuclear targeting in eukaryotic cells, preferably human cells. The invention also encompasses methods for delivering multiple nucleic acid components, wherein each nucleic acid component is specific for a different target locus of interest thereby modifying multiple target loci of interest. The nucleic acid component of the complex may comprise one or more protein-binding RNA aptamers. The one or more aptamers may be capable of binding a bacteriophage coat protein.

In a further aspect, the invention provides a eukaryotic cell comprising a modified target locus of interest, wherein the target locus of interest has been modified according to any of the herein described methods. A further aspect provides a cell line of said cell. Another aspect provides a multicellular organism comprising one or more said cells.

In certain embodiments, the modification of the target locus of interest may result in: the eukaryotic cell comprising altered expression of at least one gene product; the eukaryotic cell comprising altered expression of at least one gene product, wherein the expression of the at least one gene product is increased; the eukaryotic cell comprising altered expression of at least one gene product, wherein the expression of the at least one gene product is decreased; or the eukaryotic cell comprising an edited genome.

In certain embodiments, the eukaryotic cell may be a mammalian cell or a human cell.

In further embodiments, the non-naturally occurring or engineered compositions, the vector systems, or the delivery systems as described in the present specification may be used for: site-specific gene knockout; site-specific genome editing; RNA sequence-specific interference; or multiplexed genome engineering.

Also provided is a gene product from the cell, the cell line, or the organism as described herein. In certain embodiments, the amount of gene product expressed may be greater than or less than the amount of gene product from a cell that does not have altered expression or edited genome. In certain embodiments, the gene product may be altered in comparison with the gene product from a cell that does not have altered expression or edited genome.

In another aspect, the invention provides a method for identifying novel nucleic acid modifying effectors, comprising: identifying putative nucleic acid modifying loci from a set of nucleic acid sequences encoding the putative nucleic acid modifying enzyme loci that are within a defined distance from a conserved genomic element of the loci, that comprise at least one protein above a defined size limit, or both; grouping the identified putative nucleic acid modifying loci into subsets comprising homologous proteins; identifying a final set of candidate nucleic acid modifying loci by selecting nucleic acid modifying loci from one or more subsets based on one or more of the following: subsets comprising loci with putative effector proteins with low domain homology matches to known protein domains relative to loci in other subsets, subsets comprising putative proteins with minimal distances to the conserved genomic element relative to loci in other subsets, subsets with loci comprising large effector proteins having the same orientation as putative adjacent accessory proteins relative to large effector proteins in other subsets, subsets comprising putative effector proteins with lower existing nucleic acid modifying classifications relative to other loci, subsets comprising loci with a lower proximity to known nucleic acid modifying loci relative to other subsets, and total number of candidate loci in each subset.

In one embodiment, the set of nucleic acid sequences is obtained from a genomic or metagenomic database, such as a genomic or metagenomic database comprising prokaryotic genomic or metagenomic sequences.

In one embodiment, the defined distance from the conserved genomic element is between 1 kb and 25 kb.

In one embodiment, the conserved genomic element comprises a repetitive element, such as a CRISPR array. In a specific embodiment, the defined distance from the conserved genomic element is within 10 kb of the CRISPR array.

In one embodiment, the defined size limit of a protein comprised within the putative nucleic acid modifying (effector) locus is greater than 200 amino acids, or more particularly, the defined size limit is greater than 700 amino acids. In one embodiment, the putative nucleic acid modifying locus is between 900 and 1800 amino acids.

In one embodiment, the conserved genomic elements are identified using a repeat or pattern finding analysis of the set of nucleic acids, such as PILER-CR.

In one embodiment, the grouping step of the method described herein is based, at least in part, on results of a domain homology search or an HHpred protein domain homology search.

In one embodiment, the defined threshold is a BLAST nearest-neighbor cut-off value of 0 to 1e-7.

In one embodiment, the method described herein further comprises a filtering step that includes only loci with putative proteins between 900 and 1800 amino acids.

In one embodiment, the method described herein further comprises experimental validation of the nucleic acid modifying function of the candidate nucleic acid modifying effectors comprising generating a set of nucleic acid constructs encoding the nucleic acid modifying effectors and performing one or more biochemical validation assays, such as through the use of PAM validation in bacterial colonies, in vitro cleavage assays, the Surveyor method, experiments in mammalian cells, PFS validation, or a combination thereof.

In one embodiment, the method described herein further comprises preparing a non-naturally occurring or engineered composition comprising one or more proteins from the identified nucleic acid modifying loci.

In one embodiment, the identified loci comprise a Class 2 CRISPR effector, or the identified loci lack Cas1 or Cas2, or the identified loci comprise a single effector.

In one embodiment, the single large effector protein is greater than 900, or greater than 1100 amino acids in length, or comprises at least one HEPN domain.

In one embodiment, the at least one HEPN domain is near an N- or C-terminus of the effector protein, or is located in an interior position of the effector protein.

In one embodiment, the single large effector protein comprises a HEPN domain at the N- and C-terminus and two HEPN domains internal to the protein.

In one embodiment, the identified loci further comprise one or two small putative accessory proteins within 2 kb to 10 kb of the CRISPR array.

In one embodiment, a small accessory protein is less than 700 amino acids. In one embodiment, the small accessory protein is from 50 to 300 amino acids in length.

In one embodiment, the small accessory protein comprises multiple predicted transmembrane domains, or comprises four predicted transmembrane domains, or comprises at least one HEPN domain.

In one embodiment, the small accessory protein comprises at least one HEPN domain and at least one transmembrane domain.

In one embodiment, the loci comprise no additional proteins out to 25 kb from the CRISPR array.

In one embodiment, the CRISPR array comprises direct repeat sequences comprising about 36 nucleotides in length. In a specific embodiment, the direct repeat comprises a GTTG/GUUG at the 5′ end that is reverse complementary to a CAAC at the 3′ end.
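The terminal complementarity described above can be checked programmatically. The following is a minimal sketch only; the function names and the 36-nucleotide example repeat are hypothetical and are not sequences from this disclosure. It normalizes RNA (U) to DNA (T) and tests whether a repeat begins with GTTG and ends with a sequence whose reverse complement is GTTG (i.e., CAAC).

```python
# Illustrative sketch; names and the example repeat are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def to_dna(seq: str) -> str:
    """Upper-case a sequence and write RNA (U) as DNA (T)."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, written in the DNA alphabet."""
    return "".join(COMPLEMENT[base] for base in reversed(to_dna(seq)))


def has_terminal_stem(direct_repeat: str, five_prime: str = "GTTG") -> bool:
    """True if the repeat starts with `five_prime` and its 3' end is the
    reverse complement of that 5' motif (CAAC for GTTG)."""
    dna = to_dna(direct_repeat)
    n = len(five_prime)
    if len(dna) < 2 * n:
        return False
    return dna.startswith(five_prime) and reverse_complement(dna[-n:]) == five_prime


# Hypothetical 36-nt repeat used only to exercise the check.
example_repeat = "GTTG" + "TAGC" * 7 + "CAAC"
print(len(example_repeat), has_terminal_stem(example_repeat))
```

The same function accepts the RNA form (GUUG ... CAAC) because the input is normalized before comparison.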

In one embodiment, the CRISPR array comprises spacer sequences comprising about 30 nucleotides in length.

In one embodiment, the identified loci lack a small accessory protein.

The invention provides a method of identifying novel CRISPR effectors, comprising: a) identifying sequences in a genomic or metagenomic database encoding a CRISPR array; b) identifying one or more Open Reading Frames (ORFs) in said selected sequences within 10 kb of the CRISPR array; c) selecting loci based on the presence of a putative CRISPR effector protein between 900-1800 amino acids in size; d) selecting loci encoding a putative accessory protein of 50-300 amino acids; and e) identifying loci encoding a putative CRISPR effector and CRISPR accessory proteins and optionally classifying them based on structure analysis.
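Steps b) to d) above amount to a distance and size filter over ORFs near a CRISPR array. The following is a minimal sketch under stated assumptions: the `Orf` and `Locus` records, the `screen_locus` function, and the string annotations "Cas1" and "Cas2" are hypothetical constructs for illustration, and upstream steps such as array detection and ORF calling are taken as already done. The thresholds (10 kb, 900-1800 amino acids, 50-300 amino acids) are those recited in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Orf:
    start: int            # nucleotide coordinate of ORF start
    end: int              # nucleotide coordinate of ORF end
    length_aa: int        # predicted protein length in amino acids
    annotation: str = ""  # hypothetical label, e.g. "Cas1"


@dataclass
class Locus:
    array_start: int      # CRISPR array start coordinate
    array_end: int        # CRISPR array end coordinate
    orfs: List[Orf] = field(default_factory=list)


def distance_to_array(orf: Orf, locus: Locus) -> int:
    """Gap in bp between an ORF and the CRISPR array (0 if they overlap)."""
    if orf.end < locus.array_start:
        return locus.array_start - orf.end
    if orf.start > locus.array_end:
        return orf.start - locus.array_end
    return 0


def screen_locus(
    locus: Locus,
    max_distance_bp: int = 10_000,
    effector_aa: Tuple[int, int] = (900, 1800),
    accessory_aa: Tuple[int, int] = (50, 300),
    require_accessory: bool = True,
    exclude_cas1_cas2: bool = True,
) -> Optional[Tuple[List[Orf], List[Orf]]]:
    """Return (effector candidates, accessory candidates) or None if the
    locus fails the filter."""
    nearby = [o for o in locus.orfs if distance_to_array(o, locus) <= max_distance_bp]
    if exclude_cas1_cas2 and any(o.annotation in {"Cas1", "Cas2"} for o in nearby):
        return None
    effectors = [o for o in nearby if effector_aa[0] <= o.length_aa <= effector_aa[1]]
    accessories = [o for o in nearby if accessory_aa[0] <= o.length_aa <= accessory_aa[1]]
    if not effectors or (require_accessory and not accessories):
        return None
    return effectors, accessories


# Hypothetical locus: one large ORF and one small ORF near the array,
# plus a large ORF that lies beyond the 10 kb window.
demo = Locus(20_000, 21_000, [
    Orf(21_500, 25_100, 1200),
    Orf(25_200, 25_800, 200),
    Orf(50_000, 53_000, 1000),
])
print(screen_locus(demo))
```

Setting `require_accessory=False` corresponds to the embodiment in which the identified loci lack a small accessory protein, and `exclude_cas1_cas2=False` relaxes the optional Cas1/Cas2 exclusion.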

In one embodiment, the CRISPR effector is a Type VI CRISPR effector. In an embodiment, step (a) comprises i) comparing sequences in a genomic and/or metagenomic database with at least one pre-identified seed sequence that encodes a CRISPR array, and selecting sequences comprising said seed sequence; or ii) identifying CRISPR arrays based on a CRISPR algorithm.

In an embodiment, step (d) comprises identifying nuclease domains. In an embodiment, step (d) comprises identifying RuvC, HPN, and/or HEPN domains.

In an embodiment, no ORF encoding Cas1 or Cas2 is present within 10 kb of the CRISPR array.

In an embodiment, an ORF in step (b) encodes a putative accessory protein of 50-300 amino acids.

In an embodiment, putative novel CRISPR effectors obtained in step (d) are used as seed sequences for further comparing genomic and/or metagenomic sequences and subsequently selecting loci of interest as described in steps a) to d) of claim 1. In an embodiment, the pre-identified seed sequence is obtained by a method comprising: (a) identifying CRISPR motifs in a genomic or metagenomic database, (b) extracting multiple features in said identified CRISPR motifs, (c) classifying the CRISPR loci using unsupervised learning, (d) identifying conserved locus elements based on said classification, and (e) selecting therefrom a putative CRISPR effector suitable as seed sequence.

In an embodiment, the features include protein elements, repeat structure, repeat sequence, spacer sequence and spacer mapping. In an embodiment, the genomic and metagenomic databases are bacterial and/or archaeal genomes. In an embodiment, the genomic and metagenomic sequences are obtained from the Ensembl and/or NCBI genome databases. In an embodiment, the structure analysis in step (d) is based on secondary structure prediction and/or sequence alignments. In an embodiment, step (d) is achieved by clustering of the remaining loci based on the proteins they encode and manual curation of the obtained clusters.

In another aspect, the disclosure provides a mutated Cas13 protein comprising one or more mutations of amino acids, wherein the amino acids: interact with a guide RNA that forms a complex with the mutated Cas13 protein; or are in a HEPN active site, a lid domain which is a domain that caps the 3′ end of the crRNA with two beta hairpins (see, e.g., FIG. 1, FIG. 18), a helical domain, selected from a helical 1 or a helical 2 domain, an inter-domain linker (IDL) domain, or a bridge helix domain of the engineered Cas13 protein. In certain embodiments, the helical domain 1 is helical domain 1-1, 1-2 or 1-3. In certain embodiments, the helical domain 2 is helical domain 2-1 or 2-2. In one aspect, the engineered Cas13 protein has a higher protease activity or polynucleotide-binding capability compared with a naturally-occurring counterpart Cas13 protein.

In some embodiments, the Cas13 protein is Cas13a, Cas13b, Cas13c, or Cas13d. In some embodiments, the Cas13 protein is Cas13b. In some embodiments, the amino acids interact with the guide RNA that forms a complex with the mutated Cas13 protein. In some embodiments, the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, and R877. In some embodiments, the amino acids are in a HEPN active site. In some embodiments, the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): amino acids 46-57, 73-79, 152-164, 1036-1046, and 1064-1074. In some embodiments, the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R156, N157, H161, R1068, N1069, and H1073. In some embodiments, the amino acids are in the inter-domain linker domain of the mutated Cas13 protein. In some embodiments, the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, K294, E296, and N297. In some embodiments, the amino acids are in the bridge helix domain of the mutated Cas13 protein. In some embodiments, the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, and R838.

In another aspect, the disclosure provides a method of altering activity of a Cas13 protein, comprising: identifying one or more candidate amino acids in the Cas13 protein based on a three-dimensional structure of at least a portion of the Cas13 protein, wherein the one or more candidate amino acids interact with a guide RNA that forms a complex with the Cas13 protein, or are in a HEPN active site, an inter-domain linker domain, or a bridge helix domain of the Cas13 protein; and mutating the one or more candidate amino acids thereby generating a mutated Cas13 protein, wherein activity of the mutated Cas13 protein is different from that of the Cas13 protein.

In some embodiments, the Cas13 protein is Cas13a, Cas13b, Cas13c, or Cas13d. In some embodiments, the Cas13 protein is Cas13b. In some embodiments, the amino acids interact with the guide RNA that forms a complex with the mutated Cas13 protein. In some embodiments, the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, and R877. In some embodiments, the amino acids are in a HEPN active site. In some embodiments, the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): amino acids 46-57, 73-79, 152-164, 1036-1046, and 1064-1074. In some embodiments, the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R156, N157, H161, R1068, N1069, and H1073. In some embodiments, the amino acids are in the inter-domain linker domain of the mutated Cas13 protein. In some embodiments, the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, K294, E296, and N297. In some embodiments, the amino acids are in the bridge helix domain of the mutated Cas13 protein. In some embodiments, the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, and R838.

In some embodiments, the Cas13 protein is Cas13b. In some embodiments, the Cas13b is a Cas13 ortholog smaller in size than Cas13 systems discovered to date. In some embodiments, the Cas13b is Cas13b-t1, Cas13b-t1a, Cas13b-t2, or Cas13b-t3. In some embodiments, the Cas13b is Cas13b-t1. In some embodiments, the Cas13b is Cas13b-t1a. In some embodiments, the Cas13b is Cas13b-t2. In some embodiments, the Cas13b is Cas13b-t3.

CAS13 ORTHOLOGS

The terms “orthologue” (also referred to as “ortholog” herein) and “homologue” (also referred to as “homolog” herein) are well known in the art. By means of further guidance, a “homologue” of a protein as used herein is a protein of the same species which performs the same or a similar function as the protein it is a homologue of. Homologous proteins may but need not be structurally related, or are only partially structurally related. An “orthologue” of a protein as used herein is a protein of a different species which performs the same or a similar function as the protein it is an orthologue of. Orthologous proteins may but need not be structurally related, or are only partially structurally related. In particular embodiments, the homologue or orthologue of a Cas13 protein as referred to herein has a sequence homology or identity of at least 60%, preferably at least 70%, preferably at least 80%, more preferably at least 85%, even more preferably at least 90%, such as for instance at least 95% with a Cas13 effector protein set forth in Tables 1-4, below. In a preferred embodiment, the Cas13b effector protein may be of or from an organism identified in Tables 1-4 or the genus to which the organism belongs.

It has been found that a number of Cas13 orthologs are characterized by common motifs. Accordingly, in particular embodiments, the Cas13b effector protein is a protein comprising a sequence having at least 70% sequence identity with one or more of the sequences consisting of DKHXFGAFLNLARHN (SEQ ID NO:96), GLLFFVSLFLDK (SEQ ID NO:97), SKIXGFK (SEQ ID NO:98), DMLNELXRCP (SEQ ID NO:99), RXZDRFPYFALRYXD (SEQ ID NO:100) and LRFQVBLGXY (SEQ ID NO:101). In further particular embodiments, the Cas13b effector protein comprises a sequence having at least 70% sequence identity with at least 2, 3, 4, 5 or all 6 of these sequences. In further particular embodiments, the sequence identity with these sequences is at least 75%, 80%, 85%, 90%, 95% or 100%. In further particular embodiments, the Cas13b effector protein is a protein comprising a sequence having 100% sequence identity with GLLFFVSLFL (SEQ ID NO:102) and RHQXRFPYF (SEQ ID NO:103). In further particular embodiments, the Cas13b effector is a Cas13b effector protein comprising a sequence having 100% sequence identity with RHQDRFPY (SEQ ID NO:104).
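As an illustration of how a candidate protein might be screened against the motifs above, the following sketch computes the best ungapped identity between a motif and any equal-length window of a protein sequence. It is a simplification and not the identity calculation defined by this disclosure: no gaps are considered, and the treatment of the letters X (any residue), B (D or N) and Z (E or Q) follows the common IUPAC convention, which is an assumption here. The function names are hypothetical. The test fragment is a short stretch of the Prevotella buccae sequence listed in Table 1.

```python
from typing import Optional, Set

# Assumed IUPAC-style ambiguity codes; None means "matches any residue".
AMBIGUITY = {"X": None, "B": {"D", "N"}, "Z": {"E", "Q"}}


def position_matches(motif_char: str, residue: str) -> bool:
    """True if a protein residue is compatible with one motif position."""
    allowed: Optional[Set[str]] = AMBIGUITY.get(motif_char, {motif_char})
    return allowed is None or residue in allowed


def best_window_identity(protein: str, motif: str) -> float:
    """Highest fraction of matching positions over all ungapped windows."""
    protein, motif = protein.upper(), motif.upper()
    if not motif or len(protein) < len(motif):
        return 0.0
    best = 0
    for i in range(len(protein) - len(motif) + 1):
        window = protein[i:i + len(motif)]
        hits = sum(position_matches(m, r) for m, r in zip(motif, window))
        best = max(best, hits)
    return best / len(motif)


def has_motif(protein: str, motif: str, threshold: float = 0.70) -> bool:
    """Apply an identity threshold such as the 70% recited above."""
    return best_window_identity(protein, motif) >= threshold


fragment = "TLVRHQDRFPYFVLRYFDLNEIF"  # excerpt of the P. buccae Cas13b sequence
print(best_window_identity(fragment, "RXZDRFPYFALRYXD"))
print(has_motif(fragment, "RHQDRFPY", threshold=1.0))
```

A gapped alignment tool would generally be used for whole-protein comparisons; the window scan is only meant for short motifs of this kind.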

In particular embodiments, the Cas13b effector protein is a Cas13b effector protein having at least 65%, preferably at least 70%, 75%, 80%, 85%, 90%, 95% or more sequence identity with a Cas13b protein from Prevotella buccae, Porphyromonas gingivalis, Prevotella saccharolytica, or Riemerella anatipestifer. In further particular embodiments, the Cas13b effector is selected from the Cas13b protein from Bacteroides pyogenes, Prevotella sp. MA2016, Riemerella anatipestifer, Porphyromonas gulae, Porphyromonas gingivalis, and Porphyromonas sp. COT-052 OH4946.

It will be appreciated that orthologs of a Table 1 Cas13b enzyme that can be within the invention can include a chimeric enzyme comprising fragments of Table 1 Cas13b enzymes of multiple orthologs. Examples of such orthologs are described elsewhere herein. A chimeric enzyme may comprise a fragment of a Table 1 Cas13b enzyme and a fragment from another CRISPR enzyme, such as an ortholog of a Table 1 Cas13b enzyme of an organism which includes but is not limited to Bergeyella, Prevotella, Porphyromonas, Bacteroides, Alistipes, Riemerella, Myroides, Flavobacterium, Capnocytophaga, Chryseobacterium, Phaeodactylibacter, Paludibacter or Psychroflexus. A chimeric enzyme can comprise a first fragment and a second fragment, wherein one of the first and second fragments is of or from a Table 1 Cas13b enzyme and the other fragment is of or from a CRISPR enzyme ortholog of a different species. In some cases, Cas13b is Cas13b-t. For example, Cas13b may be Cas13b-t1 (e.g., Cas13b-t1a), Cas13b-t2, or Cas13b-t3 (see, e.g., FIGS. 54A-54C).

In embodiments, the Cas13 RNA-targeting effector proteins referred to herein also encompass a functional variant of the effector protein or a homologue or an orthologue thereof. A “functional variant” of a protein as used herein refers to a variant of such protein which retains at least partially the activity of that protein. Functional variants may include mutants (which may be insertion, deletion, or replacement mutants), including polymorphs, etc., including as discussed herein in conjunction with Table 1. Also included within functional variants are fusion products of such protein with another, usually unrelated, nucleic acid, protein, polypeptide or peptide. Functional variants may be naturally occurring or may be man-made. In an embodiment, nucleic acid molecule(s) encoding the Cas13 RNA-targeting effector proteins, or an ortholog or homolog thereof, may be codon-optimized for expression in a eukaryotic cell. A eukaryote can be as herein discussed. Nucleic acid molecule(s) can be engineered or non-naturally occurring.

In an embodiment, the Cas13 RNA-targeting effector protein or an ortholog or homolog thereof, may comprise one or more mutations. The mutations may be artificially introduced mutations and may include but are not limited to one or more mutations in a catalytic domain, e.g., one or more mutations are introduced into one or more of the HEPN domains.

In an embodiment, the Cas13 protein or an ortholog or homolog thereof, may be used as a generic nucleic acid binding protein with fusion to or being operably linked to a functional domain. Exemplary functional domains may include but are not limited to translational initiator, translational activator, translational repressor, nucleases, in particular ribonucleases, a spliceosome, beads, a light inducible/controllable domain or a chemically inducible/controllable domain.

In an advantageous embodiment, the present invention encompasses Cas13 effector proteins with reference to Tables 1-5. In certain example embodiments, the Cas13 effector protein is from an organism identified in Tables 1-5. In certain example embodiments, the Cas13 effector protein is from an organism selected from Bergeyella zoohelcum, Prevotella intermedia, Prevotella buccae, Porphyromonas gingivalis, Bacteroides pyogenes, Alistipes sp. ZOR0009, Prevotella sp. MA2016, Riemerella anatipestifer, Prevotella aurantiaca, Prevotella saccharolytica, Myroides odoratimimus CCUG 10230, Capnocytophaga canimorsus, Porphyromonas gulae, Prevotella sp. P5-125, Flavobacterium branchiophilum, Myroides odoratimimus, Flavobacterium columnare, or Porphyromonas sp. COT-052 OH4946. In another embodiment, the one or more guide RNAs are designed to bind to one or more target RNA sequences that are diagnostic for a disease state.

In certain example embodiments, the CRISPR effector protein is a Cas13b protein selected from Table 1.

TABLE 1 Bergeyella 1 MENKTSLGNNIYYNPFKPQDKSYFAGYFNAAMENTDSVFRELGzoohelcum KRLKGKEYTSENFFDAIFKENISLVEYERYVKLLSDYFPMARLL (SEQ IDDKKEVPIKERKENFKKNFKGIIKAVRDLRNFYTHKEHGEVEITD No. 105)EIFGVLDEMLKSTVLTVKKKKVKTDKTKEILKKSIEKQLDILCQKKLEYLRDTARKIEEKRRNQRERGEKELVAPFKYSDKRDDLIAAIYNDAFDVYIDKKKDSLKESSKAKYNTKSDPQQEEGDLKIPISKNGVVFLLSLFLTKQEIHAFKSKIAGFKATVIDEATVSEATVSHGKNSICFMATHEIFSHLAYKKLKRKVRTAEINYGEAENAEQLSVYAKETLMMQMLDELSKVPDVVYQNLSEDVQKTFIEDWNEYLKENNGDVGTMEEEQVIHPVIRKRYEDKFNYFAIRFLDEFAQFPTLRFQVHLGNYLHDSRPKENLISDRRIKEKITVFGRLSELEHKKALFIKNTETNEDREHYWEIFPNPNYDFPKENISVNDKDFPIAGSILDREKQPVAGKIGIKVKLLNQQYVSEVDKAVKAHQLKQRKASKPSIQNIIEEIVPINESNPKEAIVFGGQPTAYLSMNDIHSILYEFFDKWEKKKEKLEKKGEKELRKEIGKELEKKIVGKIQAQIQQIIDKDTNAKILKPYQDGNSTAIDKEKLIKDLKQEQNILQKLKDEQTVREKEYNDFIAYQDKNREINKVRDRNHKQYLKDNLKRKYPEAPARKEVLYYREKGKVAVWLANDIKRFMPTDFKNEWKGEQHSLLQKSLAYYEQCKEELKNLLPEKVFQHLPFKLGGYFQQKYLYQFYTCYLDKRLEYISGLVQQAENFKSENKVFKKVENECFKFLKKQNYTHKELDARVQSILGYPIFLERGFMDEKPTIIKGKTFKGNEALFADWFRYYKEYQNFQTFYDTENYPLVELEKKQADRKRKTKIYQQKKNDVFTLLMAKHIFKSVFKQDSIDQFSLEDLYQSREERLGNQERARQTGERNTNYIWNKTVDLKLCDGKITVENVKLKNVGDFIKYEYDQRVQAFLKYEENIEWQAFLIKESKEEENYPYVVEREIEQYEKVRREELLKEVHLIEEYILEKVKDKEILKKGDNQNFKYYILNGLLKQLKNEDVESYKVFNLNTEPEDVNINQLKQEATDLEQKAFVLTYIRNKFAHNQLPKKEFWDYCQEKYGKIEKEKTYAEYFAEVFKKEKE ALIK Prevotella 2MEDDKKTTDSIRYELKDKHFWAAFLNLARHNVYITVNHINKIL intermediaEEGEINRDGYETTLKNTWNEIKDINKKDRLSKLIIKHFPFLEAAT (SEQ IDYRLNPTDTTKQKEEKQAEAQSLESLRKSFFVFIYKLRDLRNHYS No. 
106)HYKHSKSLERPKFEEGLLEKMYNIFNASIRLVKEDYQYNKDINPDEDFKHLDRTEEEFNYYFTKDNEGNITESGLLFFVSLFLEKKDAIWMQQKLRGFKDNRENKKKMTNEVFCRSRMLLPKLRLQSTQTQDWILLDMLNELIRCPKSLYERLREEDREKFRVPIEIADEDYDAEQEPFKNTLVRHQDRFPYFALRYFDYNEIFTNLRFQIDLGTYHFSIYKKQIGDYKESHHLTHKLYGFERIQEFTKQNRPDEWRKFVKTFNSFETSKEPYIPETTPHYHLENQKIGIRFRNDNDKIWPSLKTNSEKNEKSKYKLDKSFQAEAFLSVHELLPMMFYYLLLKTENTDNDNEIETKKKENKNDKQEKHKIEEIIENKITEIYALYDTFANGEIKSIDELEEYCKGKDIEIGHLPKQMIAILKDEHKVMATEAERKQEEMLVDVQKSLESLDNQINEEIENVERKNSSLKSGKIASWLVNDMMRFQPVQKDNEGKPLNNSKANSTEYQLLQRTLAFFGSEHERLAPYFKQTKLIESSNPHPFLKDTEWEKCNNILSFYRSYLEAKKNFLESLKPEDWEKNQYFLKLKEPKTKPKTLVQGWKNGFNLPRGIFTEPIRKWFMKHRENITVAELKRVGLVAKVIPLFFSEEYKDSVQPFYNYHFNVGNINKPDEKNFLNCEERRELLRKKKDEFKKMTDKEKEENPSYLEFKSWNKFERELRLVRNQDIVTWLLCMELFNKKKIKELNVEKIYLKNINTNTTKKEKNTEEKNGEEKNIKEKNNILNRIMPMRLPIKVYGRENFSKNKKKKIRRNTFFTVYIEEKGTKLLKQGNFKALERDRRLGGLFSFVKTPSKAESKSNTISKLRVEYELGEYQKARIEIIKDMLALEKTLIDKYNSLDTDNFNKMLTDWLELKGEPDKASFQNDVDLLIAVRNAFSHNQYPMRNRIAFANINPFSLSSANTSEEKGLGIANQLKDKTHKTIEKIIEIEKPIETKE Prevotella 3MQKQDKLFVDRKKNAIFAFPKYITIMENKEKPEPIYYELTDKHF buccaeWAAFLNLARHNVYTTINHINRRLEIAELKDDGYMMGIKGSWNE (SEQ IDQAKKLDKKVRLRDLIMKHFPFLEAAAYEMTNSKSPNNKEQRE No. 
107)KEQSEALSLNNLKNVLFIFLEKLQVLRNYYSHYKYSEESPKPIFE WP_004343973.1TSLLKNMYKVFDANVRLVKRDYMHHENIDMQRDFTHLNRKKQVGRTKNIIDSPNFHYHFADKEGNMTIAGLLFFVSLFLDKKDAIWMQKKLKGFKDGRNLREQMTNEVFCRSRISLPKLKLENVQTKDWMQLDMLNELVRCPKSLYERLREKDRESFKVPFDIFSDDYNAEEEPFKNTLVRHQDRFPYFVLRYFDLNEIFEQLRFQIDLGTYHFSIYNKRIGDEDEVRHLTHHLYGFARIQDFAPQNQPEEWRKLVKDLDHFETSQEPYISKTAPHYHLENEKIGIKFCSAHNNLFPSLQTDKTCNGRSKFNLGTQFTAEAFLSVHELLPMMFYYLLLTKDYSRKESADKVEGIIRKEISNIYAIYDAFANNEINSIADLTRRLQNTNILQGHLPKQMISILKGRQKDMGKEAERKIGEMIDDTQRRLDLLCKQTNQKIRIGKRNAGLLKSGKIADWLVNDMMRFQPVQKDQNNIPINNSKANSTEYRMLQRALALFGSENFRLKAYFNQMNLVGNDNPHPFLAETQWEHQTNILSFYRNYLEARKKYLKGLKPQNWKQYQHFLILKVQKTNRNTLVTGWKNSFNLPRGIFTQPIREWFEKHNNSKRIYDQILSFDRVGFVAKAIPLYFAEEYKDNVQPFYDYPFNIGNRLKPKKRQFLDKKERVELWQKNKELFKNYPSEKKKTDLAYLDFLSWKKFERELRLIKNQDIVTWLMFKELFNMATVEGLKIGEIHLRDIDTNTANEESNNILNRIMPMKLPVKTYETDNKGNILKERPLATFYIEETETKVLKQGNFKALVKDRRLNGLFSFAETTDLNLEEHPISKLSVDLELIKYQTTRISIFEMTLGLEKKLIDKYSTLPTDSFRNMLERWLQCKANRPELKNYVNSLIAVRNAFSHNQYPMYDATLFAEVKKFTLFPSVDTKKIELNIAPQLLEIVGKAIKEIEKSENKN Porphyromonas 4MNTVPASENKGQSRTVEDDPQYFGLYLNLARENLIEVESHVRIK gingivalisFGKKKLNEESLKQSLLCDHLLSVDRWTKVYGHSRRYLPFLHYF (SEQ IDDPDSQIEKDHDSKTGVDPDSAQRLIRELYSLLDFLRNDFSHNRL No. 
108)DGTTFEHLEVSPDISSFITGTYSLACGRAQSRFAVFFKPDDFVLAKNRKEQLISVADGKECLTVSGFAFFICLFLDREQASGMLSRIRGFKRTDENWARAVHETFCDLCIRHPHDRLESSNTKEALLLDMLNELNRCPRILYDMLPEEERAQFLPALDENSMNNLSENSLDEESRLLWDGSSDWAEALTKRIRHQDRFPYLMLRFIEEMDLLKGIRFRVDLGEIELDSYSKKVGRNGEYDRTITDHALAFGKLSDFQNEEEVSRMISGEASYPVRFSLFAPRYAIYDNKIGYCHTSDPVYPKSKTGEKRALSNPQSMGFISVHDLRKLLLMELLCEGSFSRMQSDFLRKANRILDETAEGKLQFSALFPEMRHRFIPPQNPKSKDRREKAETTLEKYKQEIKGRKDKLNSQLLSAFDMDQRQLPSRLLDEWMNIRPASHSVKLRTYVKQLNEDCRLRLRKFRKDGDGKARAIPLVGEMATFLSQDIVRMIISEETKKLITSAYYNEMQRSLAQYAGEENRRQFRAIVAELRLLDPSSGHPFLSATMETAHRYTEGFYKCYLEKKREWLAKIFYRPEQDENTKRRISVFFVPDGEARKLLPLLIRRRMKEQNDLQDWIRNKQAHPIDLPSHLFDSKVMELLKVKDGKKKWNEAFKDWWSTKYPDGMQPFYGLRRELNIHGKSVSYIPSDGKKFADCYTHLMEKTVRDKKRELRTAGKPVPPDLAADIKRSFHRAVNEREFMLRLVQEDDRLMLMAINKMMTDREEDILPGLKNIDSILDEENQFSLAVHAKVLEKEGEGGDNSLSLVPATIEIKSKRKDWSKYIRYRYDRRVPGLMSHFPEHKATLDEVKTLLGEYDRCRIKIFDWAFALEGAIMSDRDLKPYLHESSSREGKSGEHSTLVKMLVEKKGCLTPDESQYLILIRNKAAHNQFPCAAEMPLIYRDVSAKVGSIEGSSAKDLPEGSSLVDSLWKKYEMIIRKILPILDPENRFFGKLLNNMSQPINDL Bacteroides 5MESIKNSQKSTGKTLQKDPPYFGLYLNMALLNVRKVENHIRKW pyogenesLGDVALLPEKSGFHSLLTTDNLSSAKWTRFYYKSRKFLPFLEMF (SEQ IDDSDKKSYENRRETAECLDTIDRQKISSLLKEVYGKLQDIRNAFS No. 
109)HYHIDDQSVKHTALIISSEMHRFIENAYSFALQKTRARFTGVFVETDFLQAEEKGDNKKFFAIGGNEGIKLKDNALIFLICLFLDREEAFKFLSRATGFKSTKEKGFLAVRETFCALCCRQPHERLLSVNPREALLMDMLNELNRCPDILFEMLDEKDQKSFLPLLGEEEQAHILENSLNDELCEAIDDPFEMIASLSKRVRYKNRFPYLMLRYIEEKNLLPFIRFRIDLGCLELASYPKKMGEENNYERSVTDHAMAFGRLTDFHNEDAVLQQITKGITDEVRFSLYAPRYAIYNNKIGFVRTSGSDKISFPTLKKKGGEGHCVAYTLQNTKSFGFISIYDLRKILLLSFLDKDKAKNIVSGLLEQCEKHWKDLSENLFDAIRTELQKEFPVPLIRYTLPRSKGGKLVSSKLADKQEKYESEFERRKEKLTEILSEKDFDLSQIPRRMIDEWLNVLPTSREKKLKGYVETLKLDCRERLRVFEKREKGEHPLPPRIGEMATDLAKDIIRMVIDQGVKQRITSAYYSEIQRCLAQYAGDDNRRHLDSIIRELRLKDTKNGHPFLGKVLRPGLGHTEKLYQRYFEEKKEWLEATFYPAASPKRVPRFVNPPTGKQKELPLIIRNLMKERPEWRDWKQRKNSHPIDLPSQLFENEICRLLKDKIGKEPSGKLKWNEMFKLYWDKEFPNGMQRFYRCKRRVEVFDKVVEYEYSEEGGNYKKYYEALIDEVVRQKISSSKEKSKLQVEDLTLSVRRVFKRAINEKEYQLRLLCEDDRLLFMAVRDLYDWKEAQLDLDKIDNMLGEPVSVSQVIQLEGGQPDAVIKAECKLKDVSKLMRYCYDGRVKGLMPYFANHEATQEQVEMELRHYEDHRRRVFNWVFALEKSVLKNEKLRRFYEESQGGCEHRRCIDALRKASLVSEEEYEFLVHIRNKSAHNQFPDLEIGKLPPNVTSGFCECIWSKYKAIICRI IPFIDPERRFFGKLLEQKAlistipes 6 MSNEIGAFREHQFAYAPGNEKQEEATFATYFNLALSNVEGMMF sp.GEVESNPDKIEKSLDTLPPAILRQIASFIWLSKEDHPDKAYSTEE ZOR0009VKVIVTDLVRRLCFYRNYFSHCFYLDTQYFYSDELVDTTAIGEK (SEQ IDLPYNFHHFITNRLFRYSLPEITLFRWNEGERKYEILRDGLIFFCCL No. 
110)FLKRGQAERFLNELRFFKRTDEEGRIKRTIFTKYCTRESHKHIGIEEQDFLIFQDIIGDLNRVPKVCDGVVDLSKENERYIKNRETSNESDENKARYRLLIREKDKFPYYLMRYIVDFGVLPCITFKQNDYSTKEGRGQFHYQDAAVAQEERCYNFVVRNGNVYYSYMPQAQNVVRISELQGTISVEELRNMVYASINGKDVNKSVEQYLYHLHLLYEKILTISGQTIKEGRVDVEDYRPLLDKLLLRPASNGEELRRELRKLLPKRVCDLLSNRFDCSEGVSAVEKRLKAILLRHEQLLLSQNPALHIDKIKSVIDYLYLFFSDDEKFRQQPTEKAHRGLKDEEFQMYHYLVGDYDSHPLALWKELEASGRLKPEMRKLTSATSLHGLYMLCLKGTVEWCRKQLMSIGKGTAKVEAIADRVGLKLYDKLKEYTPEQLEREVKLVVMHGYAAAATPKPKAQAAIPSKLTELRFYSFLGKREMSFAAFIRQDKKAQKLWLRNFYTVENIKTLQKRQAAADAACKKLYNLVGEVERVHTNDKVLVLVAQRYRERLLNVGSKCAVTLDNPERQQKLADVYEVQNAWLSIRFDDLDFTLTHVNLSNLRKAYNLIPRKHILAFKEYLDNRVKQKLCEECRNVRRKEDLCTCCSPRYSNLTSWLKENHSESSIEREAATMMLLDVERKLLSFLLDERRKAIIEYGKFIPFSALVKECRLADAGLCGIRNDVLHDNVISYADAIGKLSAYFPKEASEAVEYIRRTKEVREQRREELMANSSQ Prevotella 7aMSKECKKQRQEKKRRLQKANFSISLTGKHVFGAYFNMARTNF sp.VKTINYILPIAGVRGNYSENQINKMLHALFLIQAGRNEELTTEQK MA2016QWEKKLRLNPEQQTKFQKLLFKHFPVLGPMMADVADHKAYL (SEQ IDNKKKSTVQTEDETFAMLKGVSLADCLDIICLMADTLTECRNFY No. 111)THKDPYNKPSQLADQYLHQEMIAKKLDKVVVASRRILKDREGLSVNEVEFLTGIDHLHQEVLKDEFGNAKVKDGKVMKTFVEYDDFYFKISGKRLVNGYTVTTKDDKPVNVNTMLPALSDFGLLYFCVLFLSKPYAKLFIDEVRLFEYSPFDDKENMIMSEMLSIYRIRTPRLHKIDSHDSKATLAMDIFGELRRCPMELYNLLDKNAGQPFFHDEVKHPNSHTPDVSKRLRYDDRFPTLALRYIDETELFKRIRFQLQLGSFRYKFYDKENCIDGRVRVRRIQKEINGYGRMQEVADKRMDKWGDLIQKREERSVKLEHEELYINLDQFLEDTADSTPYVTDRRPAYNIHANRIGLYWEDSQNPKQYKVFDENGMYIPELVVTEDKKAPIKMPAPRCALSVYDLPAMLFYEYLREQQDNEFPSAEQVIIEYEDDYRKFFKAVAEGKLKPFKRPKEFRDFLKKEYPKLRMADIPKKLQLFLCSHGLCYNNKPETVYERLDRLTLQHLEERELHIQNRLEHYQKDRDMIGNKDNQYGKKSFSDVRHGALARYLAQSMMEWQPTKLKDKEKGHDKLTGLNYNVLTAYLATYGHPQVPEEGFTPRTLEQVLINAHLIGGSNPHPFINKVLALGNRNIEELYLHYLEEELKHIRSRIQSLSSNPSDKALSALPFIHHDRMRYHERTSEEMMALAARYTTIQLPDGLFTPYILEILQKHYTENSDLQNALSQDVPVKLNPTCNAAYLITLFYQTVLKDNAQPFYLSDKTYTRNKDGEKAESFSFKRAYELFSVLNNNKKDTFPFEMIPLFLTSDEIQERLSAKLLDGDGNPVPEVGEKGKPATDSQGNTIWKRRIYSEVDDYAEKLTDRDMKISFKGEWEKLPRWKQDKIIKRRDETRRQMRDELLQRMPRYIRDIKDNERTLRRYKTQDMVLFLLAEKMFTNIISEQSSEFNWKQMRLSKVCNEAFLRQTLTFRVPVTVGETTIYVEQENMSLKNYGEFYRFLTDDRLMSLL
NNIVETLKPNENGDLVIRHTDLMSELAAYDQYRSTIFMLIQSIENLIITNNAVLDDPDADGFWVREDLPKRNNFASLLELINQLNNVELTDDERKLLVAIRNAFSHNSYNIDFSLIKDVKHLPE VAKGILQHLQSMLGVEITKPrevotella 7b MSKECKKQRQEKKRRLQKANFSISLTGKHVFGAYFNMARTNF sp.VKTINYILPIAGVRGNYSENQINKMLHALFLIQAGRNEELTTEQK MA2016QWEKKLRLNPEQQTKFQKLLFKHFPVLGPMMADVADHKAYL (SEQ IDNKKKSTVQTEDETFAMLKGVSLADCLDIICLMADTLTECRNFY No. 112)THKDPYNKPSQLADQYLHQEMIAKKLDKVVVASRRILKDREGLSVNEVEFLTGIDHLHQEVLKDEFGNAKVKDGKVMKTFVEYDDFYFKISGKRLVNGYTVTTKDDKPVNVNTMLPALSDFGLLYFCVLFLSKPYAKLFIDEVRLFEYSPFDDKENMIMSEMLSIYRIRTPRLHKIDSHDSKATLAMDIFGELRRCPMELYNLLDKNAGQPFFHDEVKHPNSHTPDVSKRLRYDDRFPTLALRYIDETELFKRIRFQLQLGSFRYKFYDKENCIDGRVRVRRIQKEINGYGRMQEVADKRMDKWGDLIQKREERSVKLEHEELYINLDQFLEDTADSTPYVTDRRPAYNIHANRIGLYWEDSQNPKQYKVFDENGMYIPELVVTEDKKAPIKMPAPRCALSVYDLPAMLFYEYLREQQDNEFPSAEQVIIEYEDDYRKFFKAVAEGKLKPFKRPKEFRDFLKKEYPKLRMADIPKKLQLFLCSHGLCYNNKPETVYERLDRLTLQHLEERELHIQNRLEHYQKDRDMIGNKDNQYGKKSFSDVRHGALARYLAQSMMEWQPTKLKDKEKGHDKLTGLNYNVLTAYLATYGHPQVPEEGFTPRTLEQVLINAHLIGGSNPHPFINKVLALGNRNIEELYLHYLEEELKHIRSRIQSLSSNPSDKALSALPFIHHDRMRYHERTSEEMMALAARYTTIQLPDGLFTPYILEILQKHYTENSDLQNALSQDVPVKLNPTCNAAYLITLFYQTVLKDNAQPFYLSDKTYTRNKDGEKAESFSFKRAYELFSVLNNNKKDTFPFEMIPLFLTSDEIQERLSAKLLDGDGNPVPEVGEKGKPATDSQGNTIWKRRIYSEVDDYAEKLTDRDMKISFKGEWEKLPRWKQDKIIKRRDETRRQMRDELLQRMPRYIRDIKDNERTLRRYKTQDMVLFLLAEKMFTNIISEQSSEFNWKQMRLSKVCNEAFLRQTLTFRVPVTVGETTIYVEQENMSLKNYGEFYRFLTDDRLMSLLNNIVETLKPNENGDLVIRHTDLMSELAAYDQYRSTIFMLIQSIENLIITNNAVLDDPDADGFWVREDLPKRNNFASLLELINQLNNVELTDDERKLLVAIRNAFSHNSYNIDFSLIKDVKHLPE VAKGILQHLQSMLGVEITKRiemerella 8 MEKPLLPNVYTLKHKFFWGAFLNIARHNAFITICHINEQLGLKT anatipestiferPSNDDKIVDVVCETWNNILNNDHDLLKKSQLTELILKHFPFLTA (SEQ IDMCYHPPKKEGKKKGHQKEQQKEKESEAQSQAEALNPSKLIEAL No. 
113)EILVNQLHSLRNYYSHYKHKKPDAEKDIFKHLYKAFDASLRMVKEDYKAHFTVNLTRDFAHLNRKGKNKQDNPDFNRYRFEKDGFFTESGLLFFTNLFLDKRDAYWMLKKVSGFKASHKQREKMTTEVFCRSRILLPKLRLESRYDHNQMLLDMLSELSRCPKLLYEKLSEENKKHFQVEADGFLDEIEEEQNPFKDTLIRHQDRFPYFALRYLDLNESFKSIRFQVDLGTYHYCIYDKKIGDEQEKRHLTRTLLSFGRLQDFTEINRPQEWKALTKDLDYKETSNQPFISKTTPHYHITDNKIGFRLGTSKELYPSLEIKDGANRIAKYPYNSGFVAHAFISVHELLPLMFYQHLTGKSEDLLKETVRHIQRIYKDFEEERINTIEDLEKANQGRLPLGAFPKQMLGLLQNKQPDLSEKAKIKIEKLIAETKLLSHRLNTKLKSSPKLGKRREKLIKTGVLADWLVKDFMRFQPVAYDAQNQPIKSSKANSTEFWFIRRALALYGGEKNRLEGYFKQTNLIGNTNPHPFLNKFNWKACRNLVDFYQQYLEQREKFLEAIKNQPWEPYQYCLLLKIPKENRKNLVKGWEQGGISLPRGLFTEAIRETLSEDLMLSKPIRKEIKKHGRVGFISRAITLYFKEKYQDKHQSFYNLSYKLEAKAPLLKREEHYEYWQQNKPQSPTESQRLELHTSDRWKDYLLYKRWQHLEKKLRLYRNQDVMLWLMTLELTKNHFKELNLNYHQLKLENLAVNVQEADAKLNPLNQTLPMVLPVKVYPATAFGEVQYHKTPIRTVYIREEHTKALKMGNFKALVKDRRLNGLFSFIKEENDTQKHPISQLRLRRELEIYQSLRVDAFKETLSLEEKLLNKHTSLSSLENEFRALLEEWKKEYAASSMVTDEHIAFIASVRNAFCHNQYPFYKEALHAPIPLFTVAQPTTEEKDGLGIAEALLKVLREYC EIVKSQI Prevotella 9MEDDKKTTGSISYELKDKHFWAAFLNLARHNVYITINHINKLLE aurantiacaIREIDNDEKVLDIKTLWQKGNKDLNQKARLRELMTKHFPFLET (SEQ IDAIYTKNKEDKKEVKQEKQAEAQSLESLKDCLFLFLDKLQEARN No. 
114)YYSHYKYSEFSKEPEFEEGLLEKMYNIFGNNIQLVINDYQHNKDINPDEDFKHLDRKGQFKYSFADNEGNITESGLLFFVSLFLEKKDAIWMQQKLNGFKDNLENKKKMTHEVFCRSRILMPKLRLESTQTQDWILLDMLNELIRCPKSLYERLQGDDREKFKVPFDPADEDYNAEQEPFKNTLIRHQDRFPYFVLRYFDYNEIFKNLRFQIDLGTYHFSIYKKLIGGQKEDRHLTHKLYGFERIQEFAKQNRPDEWKAIVKDLDTYETSNKRYISETTPHYHLENQKIGIRFRNGNKEIWPSLKTNDENNEKSKYKLDKQYQAEAFLSVHELLPMMFYYLLLKKEKPNNDEINASIVEGFIKREIRNIFKLYDAFANGEINNIDDLEKYCADKGIPKRHLPKQMVAILYDEHKDMVKEAKRKQKEMVKDTKKLLATLEKQTQKEKEDDGRNVKLLKSGEIARWLVNDMMRFQPVQKDNEGKPLNNSKANSTEYQMLQRSLALYNNEEKPTRYFRQVNLIESNNPHPFLKWTKWEECNNILTFYYSYLTKKIEFLNKLKPEDWKKNQYFLKLKEPKTNRETLVQGWKNGFNLPRGIFTEPIREWFKRHQNNSKEYEKVEALDRVGLVTKVIPLFFKEEYFKDKEENFKEDTQKEINDCVQPFYNFPYNVGNIHKPKEKDFLHREERIELWDKKKDKFKGYKEKIKSKKLTEKDKEEFRSYLEFQSWNKFERELRLVRNQDIVTWLLCKELIDKLKIDELNIEELKKLRLNNIDTDTAKKEKNNILNRVMPMELPVTVYEIDDSHKIVKDKPLHTIYIKEAETKLLKQGNFKALVKDRRLNGLFSFVKTNSEAESKRNPISKLRVEYELGEYQEARIEIIQDMLALEEKLINKYKDLPTNKFSEMLNSWLEGKDEADKARFQNDVDFLIAVRNAFSHNQYPMHNKIEFANIKPFSLYTANNSEEKGLGIANQLKDKTKETTDKIKKIEKPIETKE Prevotella 10MEDKPFWAAFFNLARHNVYLTVNHINKLLDLEKLYDEGKHKEI saccharolyticaFEREDIFNISDDVMNDANSNGKKRKLDIKKIWDDLDTDLTRKY (SEQQLRELILKHFPFIQPAIIGAQTKERTTIDKDKRSTSTSNDSLKQTG ID No.EGDINDLLSLSNVKSMFFRLLQILEQLRNYYSHVKHSKSATMPN 
115)FDEDLLNWMRYIFIDSVNKVKEDYSSNSVIDPNTSFSHLIYKDEQGKIKPCRYPFTSKDGSINAFGLLFFVSLFLEKQDSIWMQKKIPGFKKASENYMKMTNEVFCRNHILLPKIRLETVYDKDWMLLDMLNEVVRCPLSLYKRLTPAAQNKFKVPEKSSDNANRQEDDNPFSRILVRHQNRFPYFVLRFFDLNEVFTTLRFQINLGCYHFAICKKQIGDKKEVHHLIRTLYGFSRLQNFTQNTRPEEWNTLVKTTEPSSGNDGKTVQGVPLPYISYTIPHYQIENEKIGIKIFDGDTAVDTDIWPSVSTEKQLNKPDKYTLTPGFKADVFLSVHELLPMMFYYQLLLCEGMLKTDAGNAVEKVLIDTRNAIFNLYDAFVQEKINTITDLENYLQDKPILIGHLPKQMIDLLKGHQRDMLKAVEQKKAMLIKDTERRLKLLDKQLKQETDVAAKNTGTLLKNGQIADWLVNDMMRFQPVKRDKEGNPINCSKANSTEYQMLQRAFAFYATDSCRLSRYFTQLHLIHSDNSHLFLSRFEYDKQPNLIAFYAAYLKAKLEFLNELQPQNWASDNYFLLLRAPKNDRQKLAEGWKNGFNLPRGLFTEKIKTWFNEHKTIVDISDCDIFKNRVGQVARLIPVFFDKKFKDHSQPFYRYDFNVGNVSKPTEANYLSKGKREELFKSYQNKFKNNIPAEKTKEYREYKNFSLWKKFERELRLIKNQDILIWLMCKNLFDEKIKPKKDILEPRIAVSYIKLDSLQTNTSTAGSLNALAKVVPMTLAIHIDSPKPKGKAGNNEKENKEFTVYIKEEGTKLLKWGNFKTLLADRRIKGLFSYIEHDDIDLKQHPLTKRRVDLELDLYQTCRIDIFQQTLGLEAQLLDKYSDLNTDNFYQMLIGWRKKEGIPRNIKEDTDFLKDVRNAFSHNQYPDSKKIAFRRIRKFNPKELILEEEEGLGIATQMYKEVEKV VNRIKRIELFDHMPREF9712_03108 11 MKDILTTDTTEKQNRFYSHKIADKYFFGGYFNLASNNIYEVFEE[Myroides VNKRNTFGKLAKRDNGNLKNYIIHVFKDELSISDFEKRVAIFAS odoratimimusYFPILETVDKKSIKERNRTIDLTLSQRIRQFREMLISLVTAVDQLR CCUG 10230]NFYTHYHHSDIVIENKVLDFLNSSFVSTALHVKDKYLKTDKTKE (SEQ IDFLKETIAAELDILIEAYKKKQIEKKNTRFKANKREDILNAIYNEA No. 
116)FWSFINDKDKDKDKETVVAKGADAYFEKNHHKSNDPDFALNISEKGIVYLLSFFLTNKEMDSLKANLTGFKGKVDRESGNSIKYMATQRIYSFHTYRGLKQKIRTSEEGVKETLLMQMIDELSKVPNVVYQHLSTTQQNSFIEDWNEYYKDYEDDVETDDLSRVIHPVIRKRYEDRFNYFAIRFLDEFFDFPTLRFQVHLGDYVHDRRTKQLGKVESDRIIKEKVTVFARLKDINSAKASYFHSLEEQDKEELDNKWTLFPNPSYDFPKEHTLQHQGEQKNAGKIGIYVKLRDTQYKEKAALEEARKSLNPKERSATKASKYDIITQIIEANDNVKSEKPLVFTGQPIAYLSMNDIHSMLFSLLTDNAELKKTPEEVEAKLIDQIGKQINEILSKDTDTKILKKYKDNDLKETDTDKITRDLARDKEEIEKLILEQKQRADDYNYTSSTKFNIDKSRKRKHLLFNAEKGKIGVWLANDIKRFMFKESKSKWKGYQHTELQKLFAYFDTSKSDLELILSNMVMVKDYPIELIDLVKKSRTLVDFLNKYLEARLEYIENVITRVKNSIGTPQFKTVRKECFTFLKKSNYTVVSLDKQVERILSMPLFIERGFMDDKPTMLEGKSYKQHKEKFADWFVHYKENSNYQNFYDTEVYEITTEDKREKAKVTKKIKQQQKNDVFTLMMVNYMLEEVLKLSSNDRLSLNELYQTKEERIVNKQVAKDTQERNKNYIWNKVVDLQLCDGLVHIDNVKLKDIGNFRKYENDSRVKEFLTYQSDIVWSAYLSNEVDSNKLYVIERQLDNYESIRSKELLKEVQEIECSVYNQVANKESLKQSGNENFKQYVLQGLLPIGMDVREMLILSTDVKFKKEEIIQLGQAGEVEQDLYSLIYIRNKFAHNQLPIKEFFDFCENNYRSISD NEYYAEYYMEIFRSIKEKYANPrevotella 12 MEDDKKTTDSIRYELKDKHFWAAFLNLARHNVYITVNHINKIL intermediaEEDEINRDGYENTLENSWNEIKDINKKDRLSKLIIKHFPFLEATT (SEQ IDYRQNPTDTTKQKEEKQAEAQSLESLKKSFFVFIYKLRDLRNHYS No. 
117)HYKHSKSLERPKFEEDLQNKMYNIFDVSIQFVKEDYKHNTDINPKKDFKHLDRKRKGKFHYSFADNEGNITESGLLFFVSLFLEKKDAIWVQKKLEGFKCSNKSYQKMTNEVFCRSRMLLPKLRLESTQTQDWILLDMLNELIRCPKSLYERLQGVNRKKFYVSFDPADEDYDAEQEPFKNTLVRHQDRFPYFALRYFDYNEVFANLRFQIDLGTYHFSIYKKLIGGQKEDRHLTHKLYGFERIQEFDKQNRPDEWKAIVKDSDTFKKKEEKEEEKPYISETTPHYHLENKKIGIAFKNHNIWPSTQTELTNNKRKKYNLGTSIKAEAFLSVHELLPMMFYYLLLKTENTKNDNKVGGKKETKKQGKHKIEAIIESKIKDIYALYDAFANGEINSEDELKEYLKGKDIKIVHLPKQMIAILKNEHKDMAEKAEAKQEKMKLATENRLKTLDKQLKGKIQNGKRYNSAPKSGEIASWLVNDMMRFQPVQKDENGESLNNSKANSTEYQLLQRTLAFFGSEHERLAPYFKQTKLIESSNPHPFLNDTEWEKCSNILSFYRSYLKARKNFLESLKPEDWEKNQYFLMLKEPKTNRETLVQGWKNGFNLPRGFFTEPIRKWFMEHWKSIKVDDLKRVGLVAKVTPLFFSEKYKDSVQPFYNYPFNVGDVNKPKEEDFLHREERIELWDKKKDKFKGYKAKKKFKEMTDKEKEEHRSYLEFQSWNKFERELRLVRNQDIVTWLLCTELIDKLKIDELNIKELKKLRLKDINTDTAKKEKNNILNRVMPMELPVTVYKVNKGGYIIKNKPLHTIYIKEAETKLLKQGNFKALVKDRRLNGLFSFVKTPSEAESESNPISKLRVEYELGKYQNARLDIIEDMLALEKKLIDKYNSLDTDNFHNMLTGWLELKGEAKKARFQNDVKLLTAVRNAFSHNQYPMYDENLFGNIERFSLSSSNIIESKGLDIAAKLKEEVSKAAKKIQNEEDNKKEKET Capnocytophaga 13MKNIQRLGKGNEFSPFKKEDKFYFGGFLNLANNNIEDFFKEIITR canimorsusFGIVITDENKKPKETFGEKILNEIFKKDISIVDYEKWVNIFADYFP (SEQ IDFTKYLSLYLEEMQFKNRVICFRDVMKELLKTVEALRNFYTHYD No. 
118)HEPIKIEDRVFYFLDKVLLDVSLTVKNKYLKTDKTKEFLNQHIGEELKELCKQRKDYLVGKGKRIDKESEIINGIYNNAFKDFICKREKQDDKENHNSVEKILCNKEPQNKKQKSSATVWELCSKSSSKYTEKSFPNRENDKHCLEVPISQKGIVFLLSFFLNKGEIYALTSNIKGFKAKITKEEPVTYDKNSIRYMATHRMFSFLAYKGLKRKIRTSEINYNEDGQASSTYEKETLMLQMLDELNKVPDVVYQNLSEDVQKTFIEDWNEYLKENNGDVGTMEEEQVIHPVIRKRYEDKFNYFAIRFLDEFAQFPTLRFQVHLGNYLCDKRTKQICDTTTEREVKKKITVFGRLSELENKKAIFLNEREEIKGWEVFPNPSYDFPKENISVNYKDFPIVGSILDREKQPVSNKIGIRVKIADELQREIDKAIKEKKLRNPKNRKANQDEKQKERLVNEIVSTNSNEQGEPVVFIGQPTAYLSMNDIHSVLYEFLINKISGEALETKIVEKIETQIKQIIGKDATTKILKPYTNANSNSINREKLLRDLEQEQQILKTLLEEQQQREKDKKDKKSKRKHELYPSEKGKVAVWLANDIKRFMPKAFKEQWRGYHHSLLQKYLAYYEQSKEELKNLLPKEVFKHFPFKLKGYFQQQYLNQFYTDYLKRRLSYVNELLLNIQNFKNDKDALKATEKECFKFFRKQNYIINPINIQIQSILVYPIFLKRGFLDEKPTMIDREKFKENKDTELADWFMHYKNYKEDNYQKFYAYPLEKVEEKEKFKRNKQINKQKKNDVYTLMMVEYIIQKIFGDKFVEENPLVLKGIFQSKAERQQNNTHAATTQERNLNGILNQPKDIKIQGKITVKGVKLKDIGNFRKYEIDQRVNTFLDYEPRKEWMAYLPNDWKEKEKQGQLPPNNVIDRQISKYETVRSKILLKDVQELEKIISDEIKEEHRHDLKQGKYYNFKYYILNGLLRQLKNENVENYKVFKLNTNPEKVNITQLKQEATDLEQKAFVLTYIRNKFAHNQLPKKEFWDYCQEKYGKIEKEKTYAEYFAE VFKREKEALIK Porphyromonas 14MTEQSERPYNGTYYTLEDKHFWAAFLNLARHNAYITLTHIDRQ gulaeLAYSKADITNDQDVLSFKALWKNFDNDLERKSRLRSLILKHFSF (SEQ IDLEGAAYGKKLFESKSSGNKSSKNKELTKKEKEELQANALSLDN No. 
119)LKSILFDFLQKLKDFRNYYSHYRHSGSSELPLFDGNMLQRLYNVFDVSVQRVKIDHEHNDEVDPHYHFNHLVRKGKKDRYGHNDNPSFKHHFVDGEGMVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTETYQQMTNEVFCRSRISLPKLKLESLRMDDWMLLDMLNELVRCPKPLYDRLREDDRACFRVPVDILPDEDDTDGGGEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKMIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYISQTSPHYHIEKGKIGLRFMPEGQHLWPSPEVGTTRTGRSKYAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSEEVSAERVQGRIKRVIEDVYAVYDAFARDEINTRDELDACLADKGIRRGHLPRQMIAILSQEHKDMEEKIRKKLQEMMADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDASGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLRARKAFLERIGRSDRVENRPFLLLKEPKTDRQTLVAGWKGEFHLPRGIFTEAVRDCLIEMGHDEVASYKEVGFMAKAVPLYFERACEDRVQPFYDSPFNVGNSLKPKKGRFLSKEERAEEWERGKERFRDLEAWSYSAARRIEDAFAGIEYASPGNKKKIEQLLRDLSLWEAFESKLKVRADRINLAKLKKEILEAQEHPYHDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRPNVQEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEEAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGGLAMEQYPISKLRVEYELAKYQTARVCVFELTLRLEESLLTRYPHLPDESFREMLESWSDPLLAKWPELHGKVRLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAK ETVERIIQA Prevotella 15MNIPALVENQKKYFGTYSVMAMLNAQTVLDHIQKVADIEGEQ sp. P5-125NENNENLWFHPVMSHLYNAKNGYDKQPEKTMFIIERLQSYFPF (SEQ IDLKIMAENQREYSNGKYKQNRVEVNSNDIFEVLKRAFGVLKMY No. 
120)RDLTNHYKTYEEKLNDGCEFLTSTEQPLSGMINNYYTVALRNMNERYGYKTEDLAFIQDKRFKFVKDAYGKKKSQVNTGFFLSLQDYNGDTQKKLHLSGVGIALLICLFLDKQYINIFLSRLPIFSSYNAQSEERRIIIRSFGINSIKLPKDRIHSEKSNKSVAMDMLNEVKRCPDELFTTLSAEKQSRFRIISDDHNEVLMKRSSDRFVPLLLQYIDYGKLFDHIRFHVNMGKLRYLLKADKTCIDGQTRVRVIEQPLNGFGRLEEAETMRKQENGTFGNSGIRIRDFENMKRDDANPANYPYIVDTYTHYILENNKVEMFINDKEDSAPLLPVIEDDRYVVKTIPSCRMSTLEIPAMAFHMFLFGSKKTEKLIVDVHNRYKRLFQAMQKEEVTAENIASFGIAESDLPQKILDLISGNAHGKDVDAFIRLTVDDMLTDTERRIKRFKDDRKSIRSADNKMGKRGFKQISTGKLADFLAKDIVLFQPSVNDGENKITGLNYRIMQSAIAVYDSGDDYEAKQQFKLMFEKARLIGKGTTEPHPFLYKVFARSIPANAVEFYERYLIERKFYLTGLSNEIKKGNRVDVPFIRRDQNKWKTPAMKTLGRIYSEDLPVELPRQMFDNEIKSHLKSLPQMEGIDFNNANVTYLIAEYMKRVLDDDFQTFYQWNRNYRYMDMLKGEYDRKGSLQHCFTSVEEREGLWKERASRTERYRKQASNKIRSNRQMRNASSEEIETILDKRLSNSRNEYQKSEKVIRRYRVQDALLFLLAKKTLTELADFDGERFKLKEIMPDAEKGILSEIMPMSFTFEKGGKKYTITSEGMKLKNYGDFFVLASDKRIGNLLELVGSDIVSKEDIMEEFNKYDQCRPEISSIVFNLEKWAFDTYPELSARVDREEKVDFKSILKILLNNKNINKEQSDILRKIRNAFDHNNYPDKGVVEIKALPEIAMSIKKAFGEYAIMK Flavobacterium 16MENLNKILDKENEICISKIFNTKGIAAPITEKALDNIKSKQKNDL branchiophilumNKEARLHYFSIGHSFKQIDTKKVFDYVLIEELKDEKPLKFITLQK (SEQDFFTKEFSIKLQKLINSIRNINNHYVHNFNDINLNKIDSNVFHFLK ID No.ESFELAIIEKYYKVNKKYPLDNEIVLFLKELFIKDENTALLNYFT 
121)NLSKDEAIEYILTFTITENKIWNINNEHNILNIEKGKYLTFEAMLFLITIFLYKNEANHLLPKLYDFKNNKSKQELFTFFSKKFTSQDIDAEEGHLIKFRDMIQYLNHYPTAWNNDLKLESENKNKIMTTKLIDSIIEFELNSNYPSFATDIQFKKEAKAFLFASNKKRNQTSFSNKSYNEEIRHNPHIKQYRDEIASALTPISFNVKEDKFKIFVKKHVLEEYFPNSIGYEKFLEYNDFTEKEKEDFGLKLYSNPKTNKLIERIDNHKLVKSHGRNQDRFMDFSMRFLAENNYFGKDAFFKCYKFYDTQEQDEFLQSNENNDDVKFHKGKVTTYIKYEEHLKNYSYWDCPFVEENNSMSVKISIGSEEKILKIQRNLMIYFLENALYNENVENQGYKLVNNYYRELKKDVEESIASLDLIKSNPDFKSKYKKILPKRLLHNYAPAKQDKAPENAFETLLKKADFREEQYKKLLKKAEHEKNKEDFVKRNKGKQFKLHFIRKACQMMYFKEKYNTLKEGNAAFEKKDPVIEKRKNKEHEFGHHKNLNITREEFNDYCKWMFAFNGNDSYKKYLRDLFSEKHFFDNQEYKNLFESSVNLEAFYAKTKELFKKWIETNKPTNNENRYTLENYKNLILQKQVFINVYHFSKYLIDKNLLNSENNVIQYKSLENVEYLISDFYFQSKLSIDQYKTCGKLFNKLKSNKLEDCLLYEIAYNYIDKKNVHKIDIQKILTSKIILTINDANTPYKISVPFNKLERYTEMIAIKNQNNLKARFLIDLPLYLSKNKIKKGKDSAGYEIIIKNDLEIEDINTINNKIINDSVKFTEVLMELEKYFILKDKCILSKNYIDNSEIPSLKQFSKVWIKENENEIINYRNIACHFHLPLLETFDNLLLNVEQKFIKEELQNVSTINDLSKPQEYLILLFIKFKHNNFYLNLFNKNESKTIKNDKEVKKNRVLQKFINQVILKKK Myroides 17MKDILTTDTTEKQNRFYSHKIADKYFFGGYFNLASNNIYEVFEE odoratimimusVNKRNTFGKLAKRDNGNLKNYIIHVFKDELSISDFEKRVAIFAS (SEQ IDYFPILETVDKKSIKERNRTIDLTLSQRIRQFREMLISLVTAVDQLR No. 
122)NFYTHYHHSDIVIENKVLDFLNSSFVSTALHVKDKYLKTDKTKEFLKETIAAELDILIEAYKKKQIEKKNTRFKANKREDILNAIYNEAFWSFINDKDKDKDKETVVAKGADAYFEKNHHKSNDPDFALNISEKGIVYLLSFFLTNKEMDSLKANLTGFKGKVDRESGNSIKYMATQRIYSFHTYRGLKQKIRTSEEGVKETLLMQMIDELSKVPNVVYQHLSTTQQNSFIEDWNEYYKDYEDDVETDDLSRVTHPVIRKRYEDRFNYFAIRFLDEFFDFPTLRFQVEILGDYVHDRRTKQLGKVESDRIIKEKVTVFARLKDINSAKASYFHSLEEQDKEELDNKWTLFPNPSYDFPKEHTLQHQGEQKNAGKIGIYVKLRDTQYKEKAALEEARKSLNPKERSATKASKYDIITQIIEANDNVKSEKPLVFTGQPIAYLSMNDIHSMLFSLLTDNAELKKTPEEVEAKLIDQIGKQINEILSKDTDTKILKKYKDNDLKETDTDKITRDLARDKEEIEKLILEQKQRADDYNYTSSTKFNIDKSRKRKHLLFNAEKGKIGVWLANDIKRFMFKESKSKWKGYQHIELQKLFAYFDTSKSDLELILSNMVMVKDYPIELIDLVKKSRTLVDFLNKYLEARLEYIENVITRVKNSIGTPQFKTVRKECFTFLKKSNYTVVSLDKQVERILSMPLFIERGFMDDKPTMLEGKSYKQHKEKFADWFVHYKENSNYQNFYDTEVYEITTEDKREKAKVTKKIKQQQKNDVFTLMMVNYMLEEVLKLSSNDRLSLNELYQTKEERIVNKQVAKDTQERNKNYIWNKVVDLQLCDGLVHIDNVKLKDIGNFRKYENDSRVKEFLTYQSDIVWSAYLSNEVDSNKLYVIERQLDNYESIRSKELLKEVQEIECSVYNQVANKESLKQSGNENFKQYVLQGLLPIGMDVREMLILSTDVKFKKEEIIQLGQAGEVEQDLYSLIYIRNKFAHNQLPIKEFFDFCENNYRSISD NEYYAEYYMEIFRSIKEKYANFlavobacterium 18 MSSKNESYNKQKTFNHYKQEDKYFFGGFLNNADDNLRQVGKE columnareFKTRINFNRNNNELASVFKDYFNKEKSVAKREHALNLLSNYFP (SEQ IDVLERIQKHTNHNFEQTREIFELLLDTIKKLRDYYTHHYHKPITIN No. 
123)PKIYDFLDDTLLDVLITIKKKKVKNDTSRELLKEKLRPELTQLKNQKREELIKKGKKLLEENLENAVFNHCLIPFLEENKTDDKQNKTVSLRKYRKSKPNEETSITLTQSGLVFLMSFFLHRKEFQVFTSGLERFKAKVNTIKEEEISLNKNNIVYMITHWSYSYYNFKGLKHRIKTDQGVSTLEQNNTTHSLTNTNTKEALLTQIVDYLSKVPNEIYETLSEKQQKEFEEDINEYMRENPENEDSTFSSIVSHKVIRKRYENKFNYFAMRFLDEYAELPTLRFMVNFGDYIKDRQKKILESIQFDSERIIKKEIHLFEKLSLVTEYKKNVYLKETSNIDLSRFPLFPNPSYVMANNNIPFYIDSRSNNLDEYLNQKKKAQSQNKKRNLTFEKYNKEQSKDAIIAMLQKEIGVKDLQQRSTIGLLSCNELPSMLYEVIVKDIKGAELENKIAQKIREQYQSIRDFTLDSPQKDNIPTTLIKTINTDSSVTFENQPIDIPRLKNALQKELTLTQEKLLNVKEHEIEVDNYNRNKNTYKFKNQPKNKVDDKKLQRKYVFYRNEIRQEANWLASDLIHFMKNKSLWKGYMHNELQSFLAFFEDKKNDCIALLETVFNLKEDCILTKGLKNLFLKHGNFIDFYKEYLKLKEDFLSTESTFLENGFIGLPPKILKKELSKRLKYIFIVFQKRQFIIKELEEKKNNLYADAINLSRGIFDEKPTMIPFKKPNPDEFASWFVASYQYNNYQSFYELTPDIVERDKKKKYKNLRAINKVKIQDYYLKLMVDTLYQDLFNQPLDKSLSDFYVSKAEREKIKADAKAYQKLNDSSLWNKVIHLSLQNNRITANPKLKDIGKYKRALQDEKIATLLTYDARTWTYALQKPEKENENDYKELHYTALNMELQEYEKVRSKELLKQVQELEKKILDKFYDFSNNASHPEDLEIEDKKGKRHPNFKLYITKALLKNESEIINLENIDIEILLKYYDYNTEELKEKIKNMDEDEKAKIINTKENYNKITNVLIKKALVLIIIRNKMAHNQYPPKFIYDLANRFVPKKEEEYFAT YFNRVFETITKELWENKEKKDKTQVPorphyromonas 19 MTEQNEKPYNGTYYTLEDKHFWAAFLNLARHNAYITLAHIDR gingivalisQLAYSKADITNDEDILFFKGQWKNLDNDLERKARLRSLILKHFS (SEQ IDFLEGAAYGKKLFESQSSGNKSSKKKELSKKEKEELQANALSLD No. 
124)NLKSILFDFLQKLKDFRNYYSHYRHPESSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDKYGNNDNPFFKHHFVDREGTVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTEAYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKSLYDRLREEDRARFRVPVDILSDEDDTDGTEEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKNIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYITQTTPHYHIEKGKIGLRFVPEGQHLWPSPEVGATRTGRSKYAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSEEVSAEKVQGRIKRVIEDVYAVYDAFARDEINTRDELDACLADKGIRRGHLPRQMIAILSQEHKDMEEKVRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVVADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLEARKAFLQSIGRSDRVENHRFLLLKEPKTDRQTLVAGWKGEFHLPRGIFTEAVRDCLIEMGYDEVGSYKEVGFMAKAVPLYFERASKDRVQPFYDYPFNVGNSLKPKKGRFLSKEKRAEEWESGKERFRLAKLKKEILEAKEHPYHDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRTDVQEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEQAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGALAMEQYPISKLRVEYELAKYQTARVCAFEQTLELEESLLTRYPHLPDKNFRKMLESWSDPLLDKWPDLHGNVRLLIAVRNAFSHNQYPMYDETLFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAKEMVERIIQA Porphyromonas 20MTEQSERPYNGTYYTLEDKHFWAAFLNLARHNAYITLTHIDRQ sp.LAYSKADITNDQDVLSFKALWKNFDNDLERKSRLRSLILKHFSF COT-052LEGAAYGKKLFESKSSGNKSSKNKELTKKEKEELQANALSLDN OH4946LKSILFDFLQKLKDFRNYYSHYRHSESSELPLFDGNMLQRLYNV (SEQ IDFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDRYGHNDN No. 
125)PSFKHHFVDSEGMVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTETYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKPLYDRLREDDRACFRVPVDILPDEDDTDGGGEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKMIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYISQTTPHYHIEKGKIGLRFVPEGQHLWPSPEVGTTRTGRSKYAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSEEVSAEKVQGRIKRVIEDVYAIYDAFARDEINTLKELDACLADKGIRRGHLPKQMIGILSQERKDMEEKVRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLRARKAFLERIGRSDRVENCPFLLLKEPKTDRQTLVAGWKGEFHLPRGIFTEAVRDCLIEMGYDEVGSYREVGFMAKAVPLYFERACEDRVQPFYDSPFNVGNSLKPKKGRFLSKEDRAEEWERGKERFRDLEAWSHSAARRIKDAFAGIEYASPGNKKKIEQLLRDLSLWEAFESKLKVRADKINLAKLKKEILEAQEHPYHDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRPNVQEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEEAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGGLAMEQYPISKLRVEYELAKYQTARVCVFELTLRLEESLLSRYPHLPDESFREMLESWSDPLLAKWPELHGKVRLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAKE TVERIIQA Prevotella 21MEDDKKTKESTNMLDNKHFWAAFLNLARHNVYITVNHINKVL intermediaELKNKKDQDIIIDNDQDILAIKTHWEKVNGDLNKTERLRELMTK (SEQ IDHFPFLETAIYTKNKEDKEEVKQEKQAKAQSFDSLKHCLFLFLEK No. 
126)LQEARNYYSHYKYSESTKEPMLEKELLKKMYNIFDDNIQLVIKDYQHNKDINPDEDFKHLDRTEEEFNYYFTTNKKGNITASGLLFFVSLFLEKKDAIWMQQKLRGFKDNRESKKKMTHEVFCRSRMLLPKLRLESTQTQDWILLDMLNELIRCPKSLYERLQGEYRKKFNVPFDSADEDYDAEQEPFKNTLVRHQDRFPYFALRYFDYNEIFTNLRFQIDLGTYHFSIYKKLIGGQKEDRHLTHKLYGFERIQEFAKQNRTDEWKAIVKDFDTYETSEEPYISETAPHYHLENQKIGIRFRNDNDEIWPSLKTNGENNEKRKYKLDKQYQAEAFLSVHELLPMMFYYLLLKKEEPNNDKKNASIVEGFIKREIRDIYKLYDAFANGEINNIDDLEKYCEDKGIPKRHLPKQMVAILYDEHKDMAEEAKRKQKEMVKDTKKLLATLEKQTQGEIEDGGRNIRLLKSGEIARWLVNDMMRFQPVQKDNEGNPLNNSKANSTEYQMLQRSLALYNKEEKPTRYFRQVNLINSSNPHPFLKWTKWEECNNILSFYRSYLTKKIEFLNKLKPEDWEKNQYFLKLKEPKTNRETLVQGWKNGFNLPRGIFTEPIREWFKRHQNDSEEYEKVETLDRVGLVTKVIPLFFKKEDSKDKEEYLKKDAQKEINNCVQPFYGFPYNVGNIHKPDEKDFLPSEERKKLWGDKKYKFKGYKAKVKSKKLTDKEKEEYRSYLEFQSWNKFERELRLVRNQDIVTWLLCTELIDKLKVEGLNVEELKKLRLKDIDTDTAKQEKNNILNRVMPMQLPVTVYEIDDSHNIVKDRPLHTVYIEETKTKLLKQGNFKALVKDRRLNGLFSFVDTSSETELKSNPISKSLVEYELGEYQNARIETIKDMLLLEETLIEKYKTLPTDNFSDMLNGWLEGKDEADKARFQNDVKLLVAVRNAFSHNQYPMRNRIAFANINPFSLSSADTSEEKKLDIANQLKDKTHKIIKRIIEIEKPIETK E PIN17_0200 AFJ07523MKMEDDKKTKESTNMLDNKHFWAAFLNLARHNVYITVNHIN [PrevotellaKVLELKNKKDQDIIIDNDQDILAIKTHWEKVNGDLNKTERLREL intermediaMTKHFPFLETAIYTKNKEDKEEVKQEKQAKAQSFDSLKHCLFL 17] (SEQFLEKLQEARNYYSHYKYSESTKEPMLEKELLKKMYNIFDDNIQ ID No.LVIKDYQHNKDINPDEDFKHLDRTEEEFNYYFTTNKKGNITASG 
127)LLFFVSLFLEKKDAIWMQQKLRGFKDNRESKKKMTHEVFCRSRMLLPKLRLESTQTQDWILLDMLNELIRCPKSLYERLQGEYRKKFNVPFDSADEDYDAEQEPFKNTLVRHQDRFPYFALRYFDYNEIFTNLRFQIDLGTYHFSIYKKLIGGQKEDRHLTHKLYGFERIQEFAKQNRTDEWKAIVKDFDTYETSEEPYISETAPHYHLENQKIGIRFRNDNDEIWPSLKTNGENNEKRKYKLDKQYQAEAFLSVHELLPMMFYYLLLKKEEPNNDKKNASIVEGFIKREIRDIYKLYDAFANGEINNIDDLEKYCEDKGIPKRHLPKQMVAILYDEHKDMAEEAKRKQKEMVKDTKKLLATLEKQTQGEIEDGGRNIRLLKSGEIARWLVNDMMRFQPVQKDNEGNPLNNSKANSTEYQMLQRSLALYNKEEKPTRYFRQVNLINSSNPHPFLKWTKWEECNNILSFYRSYLTKKIEFLNKLKPEDWEKNQYFLKLKEPKTNRETLVQGWKNGFNLPRGIFTEPIREWFKRHQNDSEEYEKVETLDRVGLVTKVIPLFFKKEDSKDKEEYLKKDAQKEINNCVQPFYGFPYNVGNIHKPDEKDFLPSEERKKLWGDKKYKFKGYKAKVKSKKLTDKEKEEYRSYLEFQSWNKFERELRLVRNQDIVTWLLCTELIDKLKVEGLNVEELKKLRLKDIDTDTAKQEKNNILNRVMPMQLPVTVYEIDDSHNIVKDRPLHTVYIEETKTKLLKQGNFKALVKDRRLNGLFSFVDTSSETELKSNPISKSLVEYELGEYQNARIETIKDMLLLEETLIEKYKTLPTDNFSDMLNGWLEGKDEADKARFQNDVKLLVAVRNAFSHNQYPMRNRIAFANINPFSLSSADTSEEKKLDIANQLKDKTHKIIKRIIEIEKPIETK E Prevotella BAU18623MEDDKKTTDSISYELKDKHFWAAFLNLARHNVYITVNHINKVL intermediaELKNKKDQDIIIDNDQDILAIKTHWEKVNGDLNKTERLRELMTK (SEQ IDHFPFLETAIYSKNKEDKEEVKQEKQAKAQSFDSLKHCLFLFLEK No. 
128)LQETRNYYSHYKYSESTKEPMLEKELLKKMYNIFDDNIQLVIKDYQHNKDINPDEDFKHLDRTEEDFNYYFTRNKKGNITESGLLFFVSLFLEKKDAIWMQQKLRGFKDNRESKKKMTHEVFCRSRMLLPKLRLESTQTQDWILLDMLNELIRCPKSLYERLQGEDREKFKVPFDPADEDYDAEQEPFKNTLVRHQDRFPYFALRYFDYNEIFTNLRFQIDLGTFHFSIYKKLIGGQKEDRHLTHKLYGFERIQEFAKQNRPDEWKAIVKDLDTYETSNERYISETTPHYHLENQKIGIRFRNDNDEIWPSLKTNGENNEKSKYKLDKQYQAEAFLSVHELLPMMFYYLLLKKEEPNNDKKNASIVEGFIKREIRDMYKLYDAFANGEINNIDDLEKYCEDKGIPKRHLPKQMVAILYDEHKDMVKEAKRKQRKMVKDTEKLLAALEKQTQEKTEDGGRNIRLLKSGEIARWLVNDMMRFQPVQKDNEGNPLNNSKANSTEYQMLQRSLALYNKEEKPTRYFRQVNLINSSNPHPFLKWTKWEECNNILSFYRSYLTKKIEFLNKLKPEDWEKNQYFLKLKEPKTNRETLVQGWKNGFNLPRGIFTEPIREWFKRHQNDSKEYEKVEALDRVGLVTKVIPLFFKKEDSKDKEEDLKKDAQKEINNCVQPFYSFPYNVGNIHKPDEKDFLHREERIELWDKKKDKFKGYKAKVKSKKLTDKEKEEYRSYLEFQSWNKFERELRLVRNQDIVTWLLCTELIDKLKVEGLNVEELKKLRLKDIDTDTAKQEKNNILNRVMPMQLPVTVYEIDDSHNIVKDRPLHTVYIEETKTKLLKQGNFKALVKDRRLNGLFSFVDTSSEAELKSNPISKSLVEYELGEYQNARIETIKDMLLLEETLIEKYKNLPTDNFSDMLNGWLEGKDEADKARFQNDVKLLVAVRNAFSHNQYPMRNRIAFANINPFSLSSADTSEEKKLDIANQLKDKTHKIIKRIIEIEKPIETKE HMPREF6485_0083 EFU31981MQKQDKLFVDRKKNAIFAFPKYITIMENKEKPEPIYYELTDKHF [PrevotellaWAAFLNLARHNVYTTINHINRRLEIAELKDDGYMMGIKGSWNE buccaeQAKKLDKKVRLRDLIMKHFPFLEAAAYEMTNSKSPNNKEQRE ATCCKEQSEALSLNNLKNVLFIFLEKLQVLRNYYSHYKYSEESPKPIFE 33574]TSLLKNMYKVFDANVRLVKRDYMHHENIDMQRDFTHLNRKK (SEQ IDQVGRTKNIIDSPNFHYHFADKEGNMTIAGLLFFVSLFLDKKDAI No. 
129)WMQKKLKGFKDGRNLREQMTNEVFCRSRISLPKLKLENVQTKDWMQLDMLNELVRCPKSLYERLREKDRESFKVPFDIFSDDYNAEEEPFKNTLVRHQDRFPYFVLRYFDLNEIFEQLRFQIDLGTYHFSIYNKRIGDEDEVRHLTHHLYGFARIQDFAPQNQPEEWRKLVKDLDHFETSQEPYISKTAPHYHLENEKIGIKFCSAHNNLFPSLQTDKTCNGRSKFNLGTQFTAEAFLSVHELLPMMFYYLLLTKDYSRKESADKVEGIIRKEISNIYAIYDAFANNEINSIADLTRRLQNTNILQGHLPKQMISILKGRQKDMGKEAERKIGEMIDDTQRRLDLLCKQTNQKIRIGKRNAGLLKSGKIADWLVNDMMRFQPVQKDQNNIPINNSKANSTEYRMLQRALALFGSENFRLKAYFNQMNLVGNDNPHPFLAETQWEHQTNILSFYRNYLEARKKYLKGLKPQNWKQYQHFLILKVQKTNRNTLVTGWKNSFNLPRGIFTQPIREWFEKHNNSKRIYDQILSFDRVGFVAKAIPLYFAEEYKDNVQPFYDYPFNIGNRLKPKKRQFLDKKERVELWQKNKELFKNYPSEKKKTDLAYLDFLSWKKFERELRLIKNQDIVTWLMFKELFNMATVEGLKIGEIHLRDIDTNTANEESNNILNRIMPMKLPVKTYETDNKGNILKERPLATFYIEETETKVLKQGNFKALVKDRRLNGLFSFAETTDLNLEEHPISKLSVDLELIKYQTTRISIFEMTLGLEKKLIDKYSTLPTDSFRNMLERWLQCKANRPELKNYVNSLIAVRNAFSHNQYPMYDATLFAEVKKFTLFPSVDTKKIELNIAPQLLEIVGKAIKEIEKSENKN HMPREF9144_1146 EGQ18444MKEEEKGKTPVVSTYNKDDKHFWAAFLNLARHNVYITVNHIN [PrevotellaKILGEGEINRDGYENTLEKSWNEIKDINKKDRLSKLIIKHFPFLE pallensVTTYQRNSADTTKQKEEKQAEAQSLESLKKSFFVFIYKLRDLRN ATCCHYSHYKHSKSLERPKFEEDLQEKMYNIFDASIQLVKEDYKHNT 700821]DIKTEEDFKHLDRKGQFKYSFADNEGNITESGLLFFVSLFLEKK (SEQ IDDAIWVQKKLEGFKCSNESYQKMTNEVFCRSRMLLPKLRLQSTQ No. 
130)TQDWILLDMLNELIRCPKSLYERLREEDRKKFRVPIEIADEDYDAEQEPFKNALVRHQDRFPYFALRYFDYNEIFTNLRFQIDLGTYHFSIYKKQIGDYKESHHLTHKLYGFERIQEFTKQNRPDEWRKFVKTFNSFETSKEPYIPETTPHYHLENQKIGIRFRNDNDKIWPSLKTNSEKNEKSKYKLDKSFQAEAFLSVHELLPMMFYYLLLKTENTDNDNEIETKKKENKNDKQEKHKIEEIIENKITEIYALYDAFANGKINSIDKLEEYCKGKDIEIGHLPKQMIAILKSEHKDMATEAKRKQEEMLADVQKSLESLDNQINEEIENVERKNSSLKSGEIASWLVNDMMRFQPVQKDNEGNPLNNSKANSTEYQMLQRSLALYNKEEKPTRYFRQVNLIESSNPHPFLNNTEWEKCNNILSFYRSYLEAKKNFLESLKPEDWEKNQYFLMLKEPKTNCETLVQGWKNGFNLPRGIFTEPIRKWFMEHRKNITVAELKRVGLVAKVIPLFFSEEYKDSVQPFYNYLFNVGNINKPDEKNFLNCEERRELLRKKKDEFKKMTDKEKEENPSYLEFQSWNKFERELRLVRNQDIVTWLLCMELFNKKKIKELNVEKIYLKNINTNTTKKEKNTEEKNGEEKIIKEKNNILNRIMPMRLPIKVYGRENFSKNKKKKIRRNTFFTVYIEEKGTKLLKQGNFKALERDRRLGGLFSFVKTHSKAESKSNTISKSRVEYELGEYQKARIEIIKDMLALEETLIDKYNSLDTDNFHNMLTGWLKLKDEPDKASFQNDVDLLIAVRNAFSHNQYPMRNRIAFANINPFSLSSANTSEEKGLGIANQLKDKTHKTIEKIIEIEKPIETKE HMPREF9714_02132 EHO08761MKDILTTDTTEKQNRFYSHKIADKYFFGGYFNLASNNIYEVFEE [MyroidesVNKRNTFGKLAKRDNGNLKNYIIHVFKDELSISDFEKRVAIFAS odoratimimusYFPILETVDKKSIKERNRTIDLTLSQRIRQFREMLISLVTAVDQLR CCUGNFYTHYHHSEIVIENKVLDFLNSSLVSTALHVKDKYLKTDKTKE 12901]FLKETIAAELDILIEAYKKKQIEKKNTRFKANKREDILNAIYNEA (SEQ IDFWSFINDKDKDKETVVAKGADAYFEKNHHKSNDPDFALNISEK No. 
131)GIVYLLSFFLTNKEMDSLKANLTGFKGKVDRESGNSIKYMATQRIYSFHTYRGLKQKIRTSEEGVKETLLMQMIDELSKVPNVVYQHLSTTQQNSFIEDWNEYYKDYEDDVETDDLSRVIHPVIRKRYEDRFNYFAIRFLDEFFDFPTLRFQVHLGDYVHDRRTKQLGKVESDRIIKEKVTVFARLKDINSAKANYFHSLEEQDKEELDNKWTLFPNPSYDFPKEHTLQHQGEQKNAGKIGIYVKLRDTQYKEKAALEEARKSLNPKERSATKASKYDIITQIIEANDNVKSEKPLVFTGQPIAYLSMNDIHSMLFSLLTDNAELKKTPEEVEAKLIDQIGKQINEILSKDTDTKILKKYKDNDLKETDTDKITRDLARDKEEIEKLILEQKQRADDYNYTSSTKFNIDKSRKRKHLLFNAEKGKIGVWLANDIKRFMTEEFKSKWKGYQHTELQKLFAYYDTSKSDLDLILSDMVMVKDYPIELIALVKKSRTLVDFLNKYLEARLGYMENVITRVKNSIGTPQFKTVRKECFTFLKKSNYTVVSLDKQVERILSMPLFIERGFMDDKPTMLEGKSYQQHKEKFADWFVHYKENSNYQNFYDTEVYEITTEDKREKAKVTKKIKQQQKNDVFTLMMVNYMLEEVLKLSSNDRLSLNELYQTKEERIVNKQVAKDTQERNKNYIWNKVVDLQLCEGLVRIDKVKLKDIGNFRKYENDSRVKEFLTYQSDIVWSAYLSNEVDSNKLYVIERQLDNYESIRSKELLKEVQEIECSVYNQVANKESLKQSGNENFKQYVLQGLVPIGMDVREMLILSTDVKFIKEEIIQLGQAGEVEQDLYSLIYIRNKFAHNQLPIKEFFDFCENNYRSISDNEY YAEYYMEIFRSIKEKYTSHMPREF9711_00870 EKB06014 MKDILTTDTTEKQNRFYSHKIADKYFFGGYFNLASNNIYEVFEE[Myroides VNKRNTFGKLAKRDNGNLKNYIIHVFKDELSISDFEKRVAIFAS odoratimimusYFPILETVDKKSIKERNRTIDLTLSQRIRQFREMLISLVTAVDQLR CCUGNFYTHYHHSEIVIENKVLDFLNSSLVSTALHVKDKYLKTDKTKE 3837]FLKETIAAELDILIEAYKKKQIEKKNTRFKANKREDILNAIYNEA (SEQ IDFWSFINDKDKDKETVVAKGADAYFEKNHHKSNDPDFALNISEK No. 
132)GIVYLLSFFLTNKEMDSLKANLTGFKGKVDRESGNSIKYMATQRIYSFHTYRGLKQKIRTSEEGVKETLLMQMIDELSKVPNVVYQHLSTTQQNSFIEDWNEYYKDYEDDVETDDLSRVIHPVIRKRYEDRFNYFAIRFLDEFFDFPTLRFQVHLGDYVHDRRTKQLGKVESDRIIKEKVTVFARLKDINSAKASYFHSLEEQDKEELDNKWTLFPNPSYDFPKEHTLQHQGEQKNAGKIGIYVKLRDTQYKEKAALEEARKSLNPKERSATKASKYDIITQIIEANDNVKSEKPLVFTGQPIAYLSMNDIHSMLFSLLTDNAELKKTPEEVEAKLIDQIGKQINEILSKDTDTKILKKYKDNDLKETDTDKITRDLARDKEEIEKLILEQKQRADDYNYTSSTKFNIDKSRKRKHLLFNAEKGKIGVWLANDIKRFMFKESKSKWKGYQHTELQKLFAYFDTSKSDLELILSDMVMVKDYPIELIDLVRKSRTLVDFLNKYLEARLGYIENVITRVKNSIGTPQFKTVRKECFAFLKESNYTVASLDKQIERILSMPLFIERGFMDSKPTMLEGKSYQQHKEDFADWFVHYKENSNYQNFYDTEVYEIITEDKREQAKVTKKIKQQQKNDVFTLMMVNYMLEEVLKLPSNDRLSLNELYQTKEERIVNKQVAKDTQERNKNYIWNKVVDLQLCEGLVRIDKVKLKDIGNFRKYENDSRVKEFLTYQSDIVWSGYLSNEVDSNKLYVIERQLDNYESIRSKELLKEVQEIECIVYNQVANKESLKQSGNENFKQYVLQGLLPRGTDVREMLILSTDVKFKKEEIMQLGQVREVEQDLYSLIYIRNKFAHNQLPIKEFFDFCENNYRPISDNEYYAEY YMEIFRSIKEKYASHMPREF9699_02005 EKB54193 MENKTSLGNNIYYNPFKPQDKSYFAGYFNAAMENTDSVFRELG[Bergeyella KRLKGKEYTSENFFDAIFKENISLVEYERYVKLLSDYFPMARLL zoohelcumDKKEVPIKERKENFKKNFKGIIKAVRDLRNFYTHKEHGEVEITD ATCCEIFGVLDEMLKSTVLTVKKKKVKTDKTKEILKKSIEKQLDILCQ 43767]KKLEYLRDTARKIEEKRRNQRERGEKELVAPFKYSDKRDDLIA (SEQ IDAIYNDAFDVYIDKKKDSLKESSKAKYNTKSDPQQEEGDLKIPIS No. 
133)KNGVVFLLSLFLTKQEIHAFKSKIAGFKATVIDEATVSEATVSHGKNSICFMATHEIFSHLAYKKLKRKVRTAEINYGEAENAEQLSVYAKETLMMQMLDELSKVPDVVYQNLSEDVQKTFIEDWNEYLKENNGDVGTMEEEQVIHPVIRKRYEDKFNYFAIRFLDEFAQFPTLRFQVHLGNYLHDSRPKENLISDRRIKEKITVFGRLSELEHKKALFIKNTETNEDREHYWEIFPNPNYDFPKENISVNDKDFPIAGSILDREKQPVAGKIGIKVKLLNQQYVSEVDKAVKAHQLKQRKASKPSIQNIIEEIVPINESNPKEAIVFGGQPTAYLSMNDIHSILYEFFDKWEKKKEKLEKKGEKELRKEIGKELEKKIVGKIQAQIQQIIDKDTNAKILKPYQDGNSTAIDKEKLIKDLKQEQNILQKLKDEQTVREKEYNDFIAYQDKNREINKVRDRNHKQYLKDNLKRKYPEAPARKEVLYYREKGKVAVWLANDIKRFMPTDFKNEWKGEQHSLLQKSLAYYEQCKEELKNLLPEKVFQHLPFKLGGYFQQKYLYQFYTCYLDKRLEYISGLVQQAENFKSENKVFKKVENECFKFLKKQNYTHKELDARVQSILGYPIFLERGFMDEKPTIIKGKTFKGNEALFADWFRYYKEYQNFQTFYDTENYPLVELEKKQADRKRKTKIYQQKKNDVFTLLMAKHIFKSVFKQDSIDQFSLEDLYQSREERLGNQERARQTGERNTNYIWNKTVDLKLCDGKITVENVKLKNVGDFIKYEYDQRVQAFLKYEENIEWQAFLIKESKEEENYPYVVEREIEQYEKVRREELLKEVHLIEEYILEKVKDKEILKKGDNQNFKYYILNGLLKQLKNEDVESYKVFNLNTEPEDVNINQLKQEATDLEQKAFVLTYIRNKFAHNQLPKKEFWDYCQEKYGKIEKEKTYAEYFAEVFKKEKE ALIK HMPREF9151_01387EKY00089 MMEKENVQGSHIYYEPTDKCFWAAFYNLARHNAYLTIAHINSF [PrevotellaVNSKKGINNDDKVLDIIDDWSKFDNDLLMGARLNKLILKHFPFL saccharolyticaKAPLYQLAKRKTRKQQGKEQQDYEKKGDEDPEVIQEAIANAFK F0055]MANVRKTLHAFLKQLEDLRNHFSHYNYNSPAKKMEVKFDDGF (SEQ IDCNKLYYVFDAALQMVKDDNRMNPEINMQTDFEHLVRLGRNR No. 
134)KIPNTFKYNFTNSDGTINNNGLLFFVSLFLEKRDAIWMQKKIKGFKGGTENYMRMTNEVFCRNRMVIPKLRLETDYDNHQLMFDMLNELVRCPLSLYKRLKQEDQDKFRVPIEFLDEDNEADNPYQENANSDENPTEETDPLKNTLVRHQHRFPYFVLRYFDLNEVFKQLRFQINLGCYHFSIYDKTIGERTEKRHLTRTLFGFDRLQNFSVKLQPEHWKNMVKHLDTEESSDKPYLSDAMPHYQIENEKIGIHFLKTDTEKKETVWPSLEVEEVSSNRNKYKSEKNLTADAFLSTHELLPMMFYYQLLSSEEKTRAAAGDKVQGVLQSYRKKIFDIYDDFANGTINSMQKLDERLAKDNLLRGNMPQQMLAILEHQEPDMEQKAKEKLDRLITETKKRIGKLEDQFKQKVRIGKRRADLPKVGSIADWLVNDMMRFQPAKRNADNTGVPDSKANSTEYRLLQEALAFYSAYKDRLEPYFRQVNLIGGTNPHPFLHRVDWKKCNHLLSFYHDYLEAKEQYLSHLSPADWQKHQHFLLLKVRKDIQNEKKDWKKSLVAGWKNGFNLPRGLFTESIKTWFSTDADKVQITDTKLFENRVGLIAKLIPLYYDKVYNDKPQPFYQYPFNINDRYKPEDTRKRFTAASSKLWNEKKMLYKNAQPDSSDKIEYPQYLDFLSWKKLERELRMLRNQDMMVWLMCKDLFAQCTVEGVEFADLKLSQLEVDVNVQDNLNVLNNVSSMILPLSVYPSDAQGNVLRNSKPLHTVYVQENNTKLLKQGNFKSLLKDRRLNGLFSFIAAEGEDLQQHPLTKNRLEYELSIYQTMRISVFEQTLQLEKAILTRNKTLCGNNFNNLLNSWSEHRTDKKTLQPDIDFLIAVRNAFSHNQYPMSTNTVMQGIEKFNIQTPKLEEKDGLGIASQLAKKTKDAASRLQNIINGGTN A3431752 EOA10535MTEQNEKPYNGTYYTLEDKHFWAAFFNLARHNAYITLTHIDRQ [PorphyromonasLAYSKADITNDEDILFFKGQWKNLDNDLERKARLRSLILKHFSF gingivalisLEGAAYGKKLFESQSSGNKSSKKKELTKKEKEELQANALSLDN JCVILKSILFDFLQKLKDFRNYYSHYRHPESSELPLFDGNMLQRLYNV SC001]FDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDRCGNNDN (SEQ IDPFFKHHFVDREEKVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKG No. 
135)GTETYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKSLYDRLREEDRARFRVPVDILSDEDDTDGTEEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKNIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYITQTTPHYHIEKGKIGLRFVPEGQLLWPSPEVGATRTGRSKYAQDKRFTAEAFLSVHELMPMMFYYFLLREKYSEEASAERVQGRIKRVIEDVYAVYDAFARGEIDTLDRLDACLADKGIRRGHLPRQMIAILSQEHKDMEEKVRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPETPFLHETRWESHTNILSFYRSYLKARKAFLQSIGRSDRVENHRFLLLKEPKTDRQTLVAGWKGEFHLPRGIFTEAVRDCLIEMGLDEVGSYKEVGFMAKAVPLYFERACKDRVQPFYDYPFNVGNSLKPKKGRFLSKEKRAEEWESGKERFRDLEAWSHSAARRIEDAFAGIENASRENKKKIEQLLQDLSLWETFESKLKVKADKINIAKLKKEILEAKEHPYLDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRTDVHEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEQAPLATVYIEERDTKLLICQGNFKSFVKDRRLNGLFSFVDTGALAMEQYPISKLRVEYELAKYQTARVCAFEQTLELEESLLTRYPHLPDKNFRKMLESWSDPLLDKWPDLHGNVRLLIAVRNAFSHNQYPMYDETLFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAK EMVERIIQA HMPREF1981_03090ERI81700 MESIKNSQKSTGKTLQKDPPYFGLYLNMALLNVRKVENHIRKW [BacteroidesLGDVALLPEKSGFHSLLTTDNLSSAKWTRFYYKSRKFLPFLEMF pyogenesDSDKKSYENRRETTECLDTIDRQKISSLLKEVYGKLQDIRNAFS F0041]HYHIDDQSVKHTALIISSEMHRFIENAYSFALQKTRARFTGVFVE (SEQ IDTDFLQAEEKGDNKKFFAIGGNEGIKLKDNALIFLICLFLDREEAF No. 
136)KFLSRATGFKSTKEKGFLAVRETFCALCCRQPHERLLSVNPREALLMDMLNELNRCPDILFEMLDEKDQKSFLPLLGEEEQAHILENSLNDELCEAIDDPFEMIASLSKRVRYKNRFPYLMLRYIEEKNLLPFIRFRIDLGCLELASYPKKMGEENNYERSVTDHAMAFGRLTDFHNEDAVLQQITKGITDEVRFSLYAPRYAIYNNKIGFVRTGGSDKISFPTLKKKGGEGHCVAYTLQNTKSFGFISIYDLRKILLLSFLDKDKAKNIVSGLLEQCEKHWKDLSENLFDAIRTELQKEFPVPLIRYTLPRSKGGKLVSSKLADKQEKYESEFERRKEKLTEILSEKDFDLSQIPRRMIDEWLNVLPTSREKKLKGYVETLKLDCRERLRVFEKREKGEHPVPPRIGEMATDLAKDIIRMVIDQGVKQRITSAYYSEIQRCLAQYAGDDNRRHLDSIIRELRLKDTKNGHPFLGKVLRPGLGHTEKLYQRYFEEKKEWLEATFYPAASPKRVPRFVNPPTGKQKELPLIIRNLMKERPEWRDWKQRKNSHPIDLPSQLFENEICRLLKDKIGKEPSGKLKWNEMFKLYWDKEFPNGMQRFYRCKRRVEVFDKVVEYEYSEEGGNYKKYYEALIDEVVRQKISSSKEKSKLQVEDLTLSVRRVFKRAINEKEYQLRLLCEDDRLLFMAVRDLYDWKEAQLDLDKIDNMLGEPVSVSQVIQLEGGQPDAVIKAECKLKDVSKLMRYCYDGRVKGLMPYFANHEATQEQVEMELRHYEDHRRRVFNWVFALEKSVLKNEKLRRFYEESQGGCEHRRCIDALRKASLVSEEEYEFLVHIRNKSAHNQFPDLEIGKLPPNVTSGFCECIWSKYKAIICRI IPFIDPERRFFGKLLEQKHMPREF1553_02065 ERJ65637 MNTVPASENKGQSRTVEDDPQYFGLYLNLARENLIEVESHVRIK[Porphyromonas FGKKKLNEESLKQSLLCDHLLSVDRWTKVYGHSRRYLPFLHYF gingivalisDPDSQIEKDHDSKTGVDPDSAQRLIRELYSLLDFLRNDFSHNRL F0568]DGTTFEHLEVSPDISSFITGTYSLACGRAQSRFADFFKPDDFVLA (SEQ IDKNRKEQLISVADGKECLTVSGLAFFICLFLDREQASGMLSRIRGF No. 
137)KRTDENWARAVHETFCDLCIRHPHDRLESSNTKEALLLDMLNELNRCPRILYDMLPEEERAQFLPALDENSMNNLSENSLNEESRLLWDGSSDWAEALTKRIRHQDRFPYLMLRFIEEMDLLKGIRFRVDLGEIELDSYSKKVGRNGEYDRTITDHALAFGKLSDFQNEEEVSRMISGEASYPVRFSLFAPRYAIYDNKIGYCHTSDPVYPKSKTGEKRALSNPRSMGFISVHDLRKLLLMELLCEGSFSRMQSDFLRKANRILDETAEGKLQFSALFPEMRHRFIPPQNPKSKDRREKAETTLEKYKQEIKGRKDKLNSQLLSAFDMDQRQLPSRLLDEWMNIRPASHSVKLRTYVKQLNEDCRLRLQKFRKDGDGKARAIPLVGEMATFLSQDIVRMIISEETKKLITSAYYNEMQRSLAQYAGEENRHQFRAIVAELRLLDPSSGHPFLSATMETAHRYTEDFYKCYLEKKREWLAKTFYRPEQDENTKRRISVFFVPDGEARKLLPLLIRRRMKEQNDLQDWIRNKQAHPIDLPSHLFDSKIMELLKVKDGKKKWNEAFKDWWSTKYPDGMQPFYGLRRELNIHGKSVSYIPSDGKKFADCYTHLMEKTVQDKKRELRTAGKPVPPDLAADIKRSFHRAVNEREFMLRLVQEDDRLMLMAINKMMTDREEDILPGLKNIDSILDEENQFSLAVHAKVLEKEGEGGDNSLSLVPATIEIKSKRKDWSKYIRYRYDRRVPGLMSHFPEHKATLDEVKTLLGEYDRCRIKIFDWAFALEGAIMSDRDLKPYLHESSSREGKSGEHSTLVKMLVEKKGCLTPDESQYLILIRNKAAHNQFPCAAEMPLIYRDVSAKVGSIEGSSAKDLPEGSSLVDSLWKKYEMIIRKILPILDPENRFFGKLLNNMSQPINDL HMPREF1988_01768 ERJ81987MNTVPASENKGQSRTVEDDPQYFGLYLNLARENLIEVESHVRIK [PorphyromonasFGKKKLNEESLKQSLLCDHLLSVDRWTKVYGHSRRYLPFLHYF gingivalisDPDSQIEKDHDSKTGVDPDSAQRLIRELYSLLDFLRNDFSHNRL F0185]DGTTFEHLEVSPDISSFITGTYSLACGRAQSRFADFFKPDDFVLA (SEQ IDKNRKEQLISVADGKECLTVSGLAFFICLFLDREQASGMLSRIRGF No. 
138)KRTDENWARAVHETFCDLCIRHPHDRLESSNTKEALLLDMLNELNRCPRILYDMLPEEERAQFLPALDENSMNNLSENSLNEESRLLWDGSSDWAEALTKRIRHQDRFPYLMLRFIEEMDLLKGIRFRVDLGEIELDSYSKKVGRNGEYDRTITDHALAFGKLSDFQNEEEVSRMISGEASYPVRFSLFAPRYAIYDNKIGYCHTSDPVYPKSKTGEKRALSNPQSMGFISVHDLRKLLLMELLCEGSFSRMQSGFLRKANRILDETAEGKLQFSALFPEMRHRFIPPQNPKSKDRREKAETTLEKYKQEIKGRKDKLNSQLLSAFDMNQRQLPSRLLDEWMNIRPASHSVKLRTYVKQLNEDCRLRLRKFRKDGDGKARAIPLVGEMATFLSQDIVRMIISEETKKLITSAYYNEMQRSLAQYAGEENRRQFRAIVAELHLLDPSSGHPFLSATMETAHRYTEDFYKCYLEKKREWLAKTFYRPEQDENTKRRISVFFVPDGEARKLLPLLIRRRMKEQNDLQDWIRNKQAHPIDLPSHLFDSKIMELLKVKDGKKKWNEAFKDWWSTKYPDGMQPFYGLRRELNIHGKSVSYIPSDGKKFADCYTHLMEKTVQDKKRELRTAGKPVPPDLAADIKRSFHRAVNEREFMLRLVQEDDRLMLMAINKMMTDREEDILPGLKNIDSILDEENQFSLAVHAKVLEKEGEGGDNSLSLVPATIEIKSKRKDWSKYIRYRYDRRVPGLMSHFPEHKATLDEVKTLLGEYDRCRIKIFDWAFALEGAIMSDRDLKPYLHESSSREGKSGEHSTLVKMLVEKKGCLTPDESQYLILIRNKAAHNQFPCAAEMPLIYRDVSAKVGSIEGSSAKDLPEGSSLVDSLWKKYEMIIRKILPILDHENRFFGKLLNNMSQPINDL HMPREF1990_01800 ERJ87335MNTVPASENKGQSRTVEDDPQYFGLYLNLARENLIEVESHVRIK [PorphyromonasFGKKKLNEESLKQSLLCDHLLSVDRWTKVYGHSRRYLPFLHYF gingivalisDPDSQIEKDHDSKTGVDPDSAQRLIRELYSLLDFLRNDFSHNRL W4087]DGTTFEHLEVSPDISSFITGTYSLACGRAQSRFADFFKPDDFVLA (SEQ IDKNRKEQLISVADGKECLTVSGLAFFICLFLDREQASGMLSRIRGF No. 
139)KRTDENWARAVHETFCDLCIRHPHDRLESSNTKEALLLDMLNELNRCPRILYDMLPEEERAQFLPALDENSMNNLSENSLNEESRLLWDGSSDWAEALTKRIRHQDRFPYLMLRFIEEMDLLKGIRFRVDLGEIELDSYSKKVGRNGEYDRTITDHALAFGKLSDFQNEEEVSRMISGEASYPVRFSLFAPRYAIYDNKIGYCHTSDPVYPKSKTGEKRALSNPRSMGFISVHDLRKLLLMELLCEGSFSRMQSDFLRKANRILDETAEGKLQFSALFPEMRHRFIPPQNPKSKDRREKAETTLEKYKQEIKGRKDKLNSQLLSAFDMDQRQLPSRLLDEWMNIRPASHSVKLRTYVKQLNEDCRLRLQKFRKDGDGKARAIPLVGEMATFLSQDIVRMIISEETKKLITSAYYNEMQRSLAQYAGEENRHQFRAIVAELRLLDPSSGHPFLSATMETAHRYTEDFYKCYLEKKREWLAKTFYRPEQDENTKRRISVFFVPDGEARKLLPLLIRRRMKEQNDLQDWIRNKQAHPIDLPSHLFDSKVMELLKVKDGKKKWNEAFKDWWSTKYPDGMQPFYGLRRELNIHGKSVSYIPSDGKKFADCYTHLMEKTVRDKKRELRTAGKPVPPDLAAYIKRSFHRAVNEREFMLRLVQEDDRLMLMAINKIMTDREEDILPGLKNIDSILDKENQFSLAVHAKVLEKEGEGGDNSLSLVPATIEIKSKRKDWSKYIRYRYDRRVPGLMSHFPEHKATLDEVKTLLGEYDRCRIKIFDWAFALEGAIMSDRDLKPYLHESSSREGKSGEHSTLVKMLVEKKGCLTPDESQYLILIRNKAAHNQFPCAAEIPLIYRDVSAKVGSIEGSSAKDLPEGSSLVDSLWKKYEMIIRKILPILDPENRFFGKLLNNMSQPINDL M573_117042 KJJ86756MKMEDDKKTTESTNMLDNKHFWAAFLNLARHNVYITVNHINK [PrevotellaVLELKNKKDQDIIIDNDQDILAIKTHWEKVNGDLNKTERLRELM intermediaTKHFPFLETAIYTKNKEDKEEVKQEKQAEAQSLESLKDCLFLFL ZT] (SEQEKLQEARNYYSHYKYSESTKEPMLEEGLLEKMYNIFDDNIQLVI ID No.KDYQHNKDINPDEDFKHLDRKGQFKYSFADNEGNITESGLLFF 
140)VSLFLEKKDAIWMQQKLTGFKDNRESKKKMTHEVFCRRRMLLPKLRLESTQTQDWILLDMLNELIRCPKSLYERLQGEYRKKFNVPFDSADEDYDAEQEPFKNTLVRHQDREPYFALRYFDYNEIFTNLRFQIDLGTYHFSIYKKLIGGQKEDRHLTHKLYGFERIQEFAKQNRPDEWKALVKDLDTYETSNERYISETTPHYHLENQKIGIRFRNGNKEIWPSLKTNGENNEKSKYKLDKPYQAEAFLSVHELLPMMFYYLLLKKEEPNNDKKNASIVEGFIKREIRDMYKLYDAFANGEINNIGDLEKYCEDKGIPKRHLPKQMVAILYDEPKDMVKEAKRKQKEMVKDTKKLLATLEKQTQEEIEDGGRNIRLLKSGEIARWLVNDMMRFQPVQKDNEGNPLNNSKANSTEYQMLQRSLALYNKEEKPTRYFRQVNLINSSNPHPFLKWTKWEECNNILSFYRNYLTKKIEFLNKLKPEDWEKNQYFLKLKEPKTNRETLVQGWKNGFNLPRGIFTEPIREWFKRHQNDSKEYEKVEALKRVGLVTKVIPLEFKEEYEKEDAQKEINNCVQPFYSFPYNVGNIHKPDEKDFLPSEERKKLWGDKKDKFKGYKAKVKSKKLTDKEKEEYRSYLEFQSWNKFERELRLVRNQDIVTWLLCTELIDKMKVEGLNVEELQKLRLKDIDTDTAKQEKNNILNRIMPMQLPVTVYEIDDSHNIVKDRPLHTVYIEETKTKLLKQGNFKALVKDRRLNGLFSFVDTSSKAELKDKPISKSVVEYELGEYQNARIETIKDMLLLEKTLIKKYEKLPTDNFSDMLNGWLEGKDESDKARFQNDVKLLVAVRNAFSHNQYPMRNRIAFANINPFSLSSADISEEKKLDIANQLKDKTHKIIKKIIEIEKPIETKE
A2033_10205 [Bacteroidetes bacterium GWA2_31_9] (SEQ ID No. 141) OFX18020.1 MENQTQKGKGIYYYYTKNEDKHYFGSFLNLANNNIEQIIEEFRIRLSLKDEKNIKEIINNYFTDKKSYTDWERGINILKEYLPVIDYLDLAITDKEFEKIDLKQKETAKRKYFRTNFSLLIDTIIDLRNFYTHYFHKPISINPDVAKFLDKNLLNVCLDIKKQKMKTDKTKQALKDGLDKELKKLIELKKAELKEKKIKTWNITENVEGAVYNDAFNHMVYKNNAGVTILKDYHKSILPDDKIDSELKLNFSISGLVFLLSMFLSKKEIEQFKSNLEGFKGKVIGENGEYEISKFNNSLKYMATHWIFSYLTFKGLKQRVKNTEDKETLLMQMIDELNKVPHEVYQTLSKEQQNEFLEDINEYVQDNEENKKSMENSIVVHPVIRKRYDDKENYFAIRELDEFANEPTLKFFVTAGNEVHDKREKQIQGSMLTSDRMIKEKINVFGKLTEIAKYKSDYFSNENTLETSEWELFPNPSYLLIQNNIPVHIDLIHNTEEAKQCQIAIDRIKCTTNPAKKRNTRKSKEEIIKIIYQKNKNIKYGDPTALLSSNELPALIYELLVNKKSGKELENIIVEKIVNQYKTIAGFEKGQNLSNSLITKKLKKSEPNEDKINAEKIILAINRELEITENKLNIIKNNRAEFRTGAKRKHIFYSKELGQEATWIAYDLKRFMPEASRKEWKGEHHSELQKFLAFYDRNKNDAKALLNMFWNFDNDQLIGNDLNSAFREFHFDKEYEKYLIKRDEILEGFKSFISNFKDEPKLLKKGIKDIYRVEDKRYYIIKSTNAQKEQLLSKPICLPRGIFDNKPTYIEGVKVESNSALFADWYQYTYSDKHEFQSFYDMPRDYKEQFEKFELNNIKSIQNKKNLNKSDKFIYFRYKQDLKIKQIKSQDLFIKLMVDELENVVEKNNIELNLKKLYQTSDERFKNQLIADVQKNREKGDTSDNKMNENFIWNMTIPLSLCNGQIEEPKVKLKDIGKFRKLETDDKVIQLLEYDKSKVWKKLEIEDELENMPNSYERIRREKLLKGIQEFEHFLLEKEKEDGINHPKHFEQDLNPNEKTYVINGVLRKNSKLNYTEIDKLLDLEHISIKDIETSAKEIHLAYFLIHVRNKFGHNQLPKLEAFELMKKYYKKNNEETYAEYFHKVSSQIVNEFKNSLEKHS
SAMN05421542_0666 [Chryseobacterium jejuense] SDI27289.1 MEKTQTGLGIYYDHTKLQDKYFFGGFFNLAQNNIDNVIKAFIIKFFPERKDKDINIAQFLDICFKDNDADSDFQKKNKFLRIHFPVIGFLTSDNDKAGFKKKFALLLKTISELRNFYTHYYHKSIEFPSELFELLDDIFVKTTSEIKKLKKKDDKTQQLLNKNLSEEYDIRYQQQIER (SEQ ID No. 
142)LKELKAQGKRVSLTDETAIRNGVFNAAFNHLIYRDGENVKPSRLYQSSYSEPDPAENGISLSQNSILFLLSMFLERKETEDLKSRVKGFKAKIIKQGEEQISGLKFMATHWVFSYLCFKGIKQKLSTEFHEETLLIQIIDELSKVPDEVYSAFDSKTKEKFLEDINEYMKEGNADLSLEDSKVIHPVIRKRYENKFNYFAIRFLDEYLSSTSLKFQVHVGNYVHDRRVKHINGTGFQTERIVKDRIKVFGRLSNISNLKADYIKEQLELPNDSNGWEIFPNPSYIFIDNNVPIHVLADEATKKGIELFKDKRRKEQPEELQKRKGKISKYNIVSMIYKEAKGKDKLRIDEPLALLSLNEIPALLYQILEKGATPKDIELIIKNKLTERFEKIKNYDPETPAPASQISKRLRNNTTAKGQEALNAEKLSLLIEREIENTETKLSSIEEKRLKAKKEQRRNTPQRSIFSNSDLGRIAAWLADDIKRFMPAEQRKNWKGYQHSQLQQSLAYFEKRPQEAFLLLKEGWDTSDGSSYWNNWVMNSFLENNHFEKFYKNYLMKRVKYFSELAGNIKQHTHNTKFLRKFIKQQMPADLFPKRHYILKDLETEKNKVLSKPLVFSRGLFDNNPTFIKGVKVTENPELFAEWYSYGYKTEHVFQHFYGWERDYNELLDSELQKGNSFAKNSIYYNRESQLDLIKLKQDLKIKKIKIQDLFLKRIAEKLFENVFNYPTTLSLDEFYLTQEERAEKERIALAQSLREEGDNSPNIIKDDFIWSKTIAFRSKQIYEPAIKLKDIGKFNRFVLDDEESKASKLLSYDKNKIWNKEQLERELSIGENSYEVIRREKLFKEIQNLELQILSNWSWDGINHPREFEMEDQKNTRHPNFKMYLVNGILRKNINLYKEDEDFWLESLKENDFKTLPSEVLETKSEMVQLLFLVILIRNQFAHNQLPEIQFYNFIRKNYPEIQNNTVAELYLNLI KLAVQKLKDNSSAMN05444360_11366 SHM52812.1MNTRVTGMGVSYDHTKKEDKHFFGGFLNLAQDNITAVIKAFCI [ChryseobacteriumKFDKNPMSSVQFAESCFTDKDSDTDFQNKVRYVRTHLPVIGYL carnipullorum]NYGGDRNTFRQKLSTLLKAVDSLRNFYTHYYHSPLALSTELFEL (SEQLDTVFASVAVEVKQHKMKDDKTRQLLSKSLAEELDIRYKQQLE ID No.RLKELKEQGKNIDLRDEAGIRNGVLNAAFNHLIYKEGEIAKPTL 
143)SYSSFYYGADSAENGITISQSGLLFLLSMFLGKKEIEDLKSRIRGFKAKIVRDGEENISGLKFMATHWIFSYLSFKGMKQRLSTDFHEETLLIQIIDELSKVPDEVYHDFDTATREKFVEDINEYIREGNEDFSLGDSTIIHPVIRKRYENKFNYFAVRFLDEFIKFPSLRFQVHLGNFVHDRRIKDIHGTGFQTERVVKDRIKVFGKLSETSSLKTEYIEKELDLDSDTGWEIFPNPSYVFIDNNIPIYISTNKTFKNGSSEFIKLRRKEKPEEMKMRGEDKKEKRDIASMIGNAGSLNSKTPLAMLSLNEMPALLYEILVKKTTPEEIELIIKEKLDSHFENIKNYDPEKPLPASQISKRLRNNTTDKGKKVINPEKLIHLINKEIDATEAKFALLAKNRKELKEKFRGKPLRQTIFSNMELGREATWLADDIKRFMPDILRKNWKGYQHNQLQQSLAFFNSRPKEAFTILQDGWDFADGSSFWNGWIINSFVKNRSFEYFYEAYFEGRKEYFSSLAENIKQHTSNHRNLRRFIDQQMPKGLFENRHYLLENLETEKNKILSKPLVFPRGLFDTKPTFIKGIKVDEQPELFAEWYQYGYSTEHVFQNFYGWERDYNDLLESELEKDNDFSKNSIHYSRTSQLELIKLKQDLKIKKIKIQDLFLKLIAGHIFENIFKYPASFSLDELYLTQEERLNKEQEALIQSQRKEGDHSDNIIKDNFIGSKTVTYESKQISEPNVKLKDIGKFNRFLLDDKVKTLLSYNEDKVWNKNDLDLELSIGENSYEVIRREKLFKKIQNFELQTLTDWPWNGTDHPEEFGTTDNKGVNHPNFKMYVVNGILRKHTDWFKEGEDNWLENLNETHFKNLSFQELETKSKSIQTAFLIIMIRNQFAHNQLPAVQFFEFIQKKYPEIQGSTTSELYLNFINLAVVELLELL EK SAMN05421786_1011119SIS70481.1 METQILGNGISYDHTKTEDKHFFGGFLNTAQNNIDLLIKAYISKF[Chryseobacterium ESSPRKLNSVQFPDVCFKKNDSDADFQHKLQFIRKHLPVIQYLKureilyticum] YGGNREVLKEKFRLLLQAVDSLRNFYTHFYHKPIQLPNELLTLL (SEQ IDDTIFGEIGNEVRQNKMKDDKTRHLLKKNLSEELDFRYQEQLER No. 
144)LRKLKSEGKKVDLRDTEAIRNGVLNAAFNHLIFKDAEDFKPTVSYSSYYYDSDTAENGISISQSGLLFLLSMFLGRREMEDLKSRVRGFKARIIKHEEQHVSGLKFMATHWVFSEFCFKGIKTRLNADYHEETLLIQLIDELSKVPDELYRSFDVATRERFIEDINEYIRDGKEDKSLIESKIVHPVIRKRYESKFNYFAIRFLDEFVNFPTLRFQVHAGNYVHDRRIKSIEGTGFKTERLVKDRIKVFGKLSTISSLKAEYLAKAVNITDDTGWELLPHPSYVFIDNNIPIHLTVDPSFKNGVKEYQEKRKLQKPEEMKNRQGGDKMHKPAISSKIGKSKDINPESPVALLSMNEIPALLYEILVKKASPEEVEAKIRQKLTAVFERIRDYDPKVPLPASQVSKRLRNNTDTLSYNKEKLVELANKEVEQTERKLALITKNRRECREKVKGKFKRQKVFKNAELGTEATWLANDIKRFMPEEQKKNWKGYQHSQLQQSLAFFESRPGEARSLLQAGWDFSDGSSFWNGWVMNSFARDNTFDGFYESYLNGRMKYFLRLADNIAQQSSTNKLISNFIKQQMPKGLFDRRLYMLEDLATEKNKILSKPLIFPRGIFDDKPTFKKGVQVSEEPEAFADWYSYGYDVKHKFQEFYAWDRDYEELLREELEKDTAFTKNSIHYSRESQIELLAKKQDLKVKKVRIQDLYLKLMAEFLFENVFGHELALPLDQFYLTQEERLKQEQEAIVQSQRPKGDDSPNIVKENFIWSKTIPFKSGRVFEPNVKLKDIGKFRNLLTDEKVDILLSYNNTEIGKQVIENELIIGAGSYEFIRREQLFKEI QQMKRLSLRSVRGMGVPIRLNLKPrevotella WP_004343581 MQKQDKLFVDRKKNAIFAFPKYITIMENQEKPEPIYYELTDKHFbuccae WAAFLNLARHNVYTTINHINRRLEIAELKDDGYMMDIKGSWNE (SEQ IDQAKKLDKKVRLRDLIMKHFPFLEAAAYEITNSKSPNNKEQREK No. 
145)EQSEALSLNNLKNVLFIFLEKLQVLRNYYSHYKYSEESPKPIFETSLLKNMYKVFDANVRLVKRDYMHHENIDMQRDFTHLNRKKQVGRTKNIIDSPNFHYHFADKEGNMTIAGLLFFVSLFLDKKDAIWMQKKLKGFKDGRNLREQMTNEVFCRSRISLPKLKLENVQTKDWMQLDMLNELVRCPKSLYERLREKDRESFKVPFDIFSDDYDAEEEPFKNTLVRHQDRFPYFVLRYFDLNEIFEQLRFQIDLGTYHFSIYNKRIGDEDEVRHLTHHLYGFARIQDFAQQNQPEVWRKLVKDLDYFEASQEPYIPKTAPHYHLENEKIGIKFCSTHNNLFPSLKTEKTCNGRSKFNLGTQFTAEAFLSVHELLPMMFYYLLLTKDYSRKESADKVEGIIRKEISNIYAIYDAFANGEINSIADLTCRLQKTNILQGHLPKQMISILEGRQKDMEKEAERKIGEMIDDTQRRLDLLCKQTNQKIRIGKRNAGLLKSGKIADWLVNDMMRFQPVQKDQNNIPINNSKANSTEYRMLQRALALFGSENFRLKAYFNQMNLVGNDNPHPFLAETQWEHQTNILSFYRNYLEARKKYLKGLKPQNWKQYQHFLILKVQKTNRNTLVTGWKNSFNLPRGIFTQPIREWFEKHNNSKRIYDQILSFDRVGFVAKAIPLYFAEEYKDNVQPFYDYPFNIGNKLKPQKGQFLDKKERVELWQKNKELFKNYPSEKKKTDLAYLDFLSWKKFERELRLIKNQDIVTWLMFKELFNMATVEGLKIGEIHLRDIDTNTANEESNNILNRIMPMKLPVKTYETDNKGNILKERPLATFYIEETETKVLKQGNFKVLAKDRRLNGLLSFAETTDIDLEKNPITKLSVDHELIKYQTTRISIFEMTLGLEKKLINKYPTLPTDSFRNMLERWLQCKANRPELKNYVNSLIAVRNAFSHNQYPMYDATLFAEVKKFTLFPSVDTKKIELNIAPQLLEIVGKAIKEIEKSENKN Porphyromonas WP_005873511MNTVPASENKGQSRTVEDDPQYFGLYLNLARENLIEVESHVRIK gingivalisFGKKKLNEESLKQSLLCDHLLSVDRWTKVYGHSRRYLPFLHYF (SEQ IDDPDSQIEKDHDSKTGVDPDSAQRLIRELYSLLDFLRNDFSHNRL No. 
146)DGTTFEHLEVSPDISSFITGTYSLACGRAQSRFADFFKPDDFVLAKNRKEQLISVADGKECLTVSGLAFFICLFLDREQASGMLSRIRGFKRTDENWARAVHETFCDLCIRHPHDRLESSNTKEALLLDMLNELNRCPRILYDMLPEEERAQFLPALDENSMNNLSENSLNEESRLLWDGSSDWAEALTKRIRHQDRFPYLMLRFIEEMDLLKGIRFRVDLGEIELDSYSKKVGRNGEYDRTITDHALAFGKLSDFQNEEEVSRMISGEASYPVRFSLFAPRYAIYDNKIGYCHTSDPVYPKSKTGEKRALSNPQSMGFISVHNLRKLLLMELLCEGSFSRMQSDFLRKANRILDETAEGKLQFSALFPEMRHRFIPPQNPKSKDRREKAETTLEKYKQEIKGRKDKLNSQLLSAFDMNQRQLPSRLLDEWMNIRPASHSVKLRTYVKQLNEDCRLRLRKFRKDGDGKARAIPLVGEMATFLSQDIVRMIISEETKKLITSAYYNEMQRSLAQYAGEENRRQFRAIVAELHLLDPSSGHPFLSATMETAHRYTEDFYKCYLEKKREWLAKTFYRPEQDENTKRRISVFFVPDGEARKLLPLLIRRRMKEQNDLQDWIRNKQAHPIDLPSHLFDSKIMELLKVKDGKKKWNEAFKDWWSTKYPDGMQPFYGLRRELNIHGKSVSYIPSDGKKFADCYTHLMEKTVQDKKRELRTAGKPVPPDLAADIKRSFHRAVNEREFMLRLVQEDDRLMLMAINKMMTDREEDILPGLKNIDSILDEENQFSLAVHAKVLEKEGEGGDNSLSLVPATIEIKSKRKDWSKYIRYRYDRRVPGLMSHFPEHKATLDEVKTLLGEYDRCRIKIFDWAFALEGAIMSDRDLKPYLHESSSREGKSGEHSTLVKMLVEKKGCLTPDESQYLILIRNKAAHNQFPCAAEMPLIYRDVSAKVGSIEGSSAKDLPEGSSLVDSLWKKYEMIIRKILPILDPENRFFGKLLNNMSQPINDL Porphyromonas WP_005874195MTEQNEKPYNGTYYTLEDKHFWAAFFNLARHNAYITLAHIDRQ gingivalisLAYSKADITNDEDILFFKGQWKNLDNDLERKARLRSLILKHFSF (SEQ IDLEGAAYGKKLFESQSSGNKSSKKKELTKKEKEELQANALSLDN No. 
147)LKSILFDFLQKLKDFRNYYSHYRHPESSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDKYGNNDNPFFKHHFVDREEKVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTEAYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKSLYDRLREEDRARFRVPVDILSDEDDTDGTEEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKNIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYITQTTPHYHIEKGKIGLRFVPEGQLLWPSPEVGATRTGRSKYAQDKRFTAEAFLSVHELMPMMFYYFLLREKYSEEASAEKVQGRIKRVIEDVYAVYDAFARDEINTRDELDACLADKGIRRGHLPRQMIAILSQEHKDMEEKVRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLKARKAFLQSIGRSDREENHRFLLLKEPKTDRQTLVAGWKSEFHLPRGIFTEAVRDCLIEMGYDEVGSYKEVGFMAKAVPLYFERACKDRVQPFYDYPFNVGNSLKPKKGRFLSKEKRAEEWESGKERFRDLEAWSHSAARRIEDAFVGIEYASWENKKKIEQLLQDLSLWETFESKLKVKADKINIAKLKKEILEAKEHPYHDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRTDVQEQGSLNVLNHVKPMRLPVVVYRADSRGHVHKEEAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGALAMEQYPISKLRVEYELAKYQTARVCAFEQTLELEESLLTRYPHLPDESFREMLESWSDPLLDKWPDLQREVRLLIAVRNAFSHNQYPMYDETIFSSIRKYDPSSLDAIEERMGLNIAHRLSEEVKLAKEMV ERIIQA PrevotellaWP_006044833 MKEEEKGKTPVVSTYNKDDKHFWAAFLNLARHNVYITVNHIN pallensKILGEGEINRDGYENTLEKSWNEIKDINKKDRLSKLIIKHFPFLE (SEQ IDVTTYQRNSADTTKQKEEKQAEAQSLESLKKSFFVFIYKLRDLRN No. 
148)HYSHYKHSKSLERPKFEEDLQEKMYNIFDASIQLVKEDYKHNTDIKTEEDFKHLDRKGQFKYSFADNEGNITESGLLFFVSLFLEKKDAIWVQKKLEGFKCSNESYQKMTNEVFCRSRMLLPKLRLQSTQTQDWILLDMLNELIRCPKSLYERLREEDRKKFRVPIEIADEDYDAEQEPFKNALVRHQDRFPYFALRYFDYNEIFTNLRFQIDLGTYHFSIYKKQIGDYKESHHLTHKLYGFERIQEFTKQNRPDEWRKFVKTFNSFETSKEPYIPETTPHYHLENQKIGIRFRNDNDKIWPSLKTNSEKNEKSKYKLDKSFQAEAFLSVHELLPMMFYYLLLKTENTDNDNEIETKKKENKNDKQEKHKIEEIIENKITEIYALYDAFANGKINSIDKLEEYCKGKDIEIGHLPKQMIAILKSEHKDMATEAKRKQEEMLADVQKSLESLDNQINEEIENVERKNSSLKSGEIASWLVNDMMRFQPVQKDNEGNPLNNSKANSTEYQMLQRSLALYNKEEKPTRYFRQVNLIESSNPHPFLNNTEWEKCNNILSFYRSYLEAKKNFLESLKPEDWEKNQYFLMLKEPKTNCETLVQGWKNGFNLPRGIFTEPIRKWFMEHRKNITVAELKRVGLVAKVIPLFFSEEYKDSVQPFYNYLFNVGNINKPDEKNFLNCEERRELLRKKKDEFKKMTDKEKEENPSYLEFQSWNKFERELRLVRNQDIVTWLLCMELFNKKKIKELNVEKIYLKNINTNTTKKEKNTEEKNGEEKIIKEKNNILNRIMPMRLPIKVYGRENFSKNKKKKIRRNTFFTVYIEEKGTKLLKQGNFKALERDRRLGGLFSFVKTHSKAESKSNTISKSRVEYELGEYQKARIEIIKDMLALEETLIDKYNSLDTDNFHNMLTGWLKLKDEPDKASFQNDVDLLIAVRNAFSHNQYPMRNRIAFANINPFSLSSANTSEEKGLGIANQLKDKTHKTIEKIIEIEKPIETKE Myroides WP_006261414MKDILTTDTTEKQNRFYSHKIADKYFFGGYFNLASNNIYEVFEE odoratimimusVNKRNTFGKLAKRDNGNLKNYIIHVFKDELSISDFEKRVAIFAS (SEQ IDYFPILETVDKKSIKERNRTIDLTLSQRIRQFREMLISLVTAVDQLR No. 
149)NFYTHYHHSEIVIENKVLDFLNSSLVSTALHVKDKYLKTDKTKEFLKETIAAELDILIEAYKKKQIEKKNTRFKANKREDILNAIYNEAFWSFINDKDKDKETVVAKGADAYFEKNHHKSNDPDFALNISEKGIVYLLSFFLTNKEMDSLKANLTGFKGKVDRESGNSIKYMATQRIYSFHTYRGLKQKIRTSEEGVKETLLMQMIDELSKVPNVVYQHLSTTQQNSFIEDWNEYYKDYEDDVETDDLSRVIHPVIRKRYEDRFNYFAIRFLDEFFDFPTLRFQVHLGDYVHDRRTKQLGKVESDRIIKEKVTVFARLKDINSAKANYFHSLEEQDKEELDNKWTLFPNPSYDFPKEHTLQHQGEQKNAGKIGIYVKLRDTQYKEKAALEEARKSLNPKERSATKASKYDIITQIIEANDNVKSEKPLVFTGQPIAYLSMNDIHSMLFSLLTDNAELKKTPEEVEAKLIDQIGKQINEILSKDTDTKILKKYKDNDLKETDTDKITRDLARDKEEIEKLILEQKQRADDYNYTSSTKFNIDKSRKRKHLLFNAEKGKIGVWLANDIKRFMTEEFKSKWKGYQHTELQKLFAYYDTSKSDLDLILSDMVMVKDYPIELIALVKKSRTLVDFLNKYLEARLGYMENVITRVKNSIGTPQFKTVRKECFTFLKKSNYTVVSLDKQVERILSMPLFIERGFMDDKPTMLEGKSYQQHKEKFADWFVHYKENSNYQNFYDTEVYEITTEDKREKAKVTKKIKQQQKNDVFTLMMVNYMLEEVLKLSSNDRLSLNELYQTKEERIVNKQVAKDTQERNKNYIWNKVVDLQLCEGLVRIDKVKLKDIGNFRKYENDSRVKEFLTYQSDIVWSAYLSNEVDSNKLYVIERQLDNYESIRSKELLKEVQEIECSVYNQVANKESLKQSGNENFKQYVLQGLVPIGMDVREMLILSTDVKFIKEEIIQLGQAGEVEQDLYSLIYIRNKFAHNQLPIKEFFDFCENNYRSISDNEY YAEYYMEIFRSIKEKYTSMyroides WP_006265509 MKDILTTDTTEKQNRFYSHKIADKYFFGGYFNLASNNIYEVFEEodoratimimus VNKRNTFGKLAKRDNGNLKNYIIHVFKDELSISDFEKRVAIFAS (SEQ IDYFPILETVDKKSIKERNRTIDLTLSQRIRQFREMLISLVTAVDQLR No. 
150)NFYTHYHHSEIVIENKVLDFLNSSLVSTALHVKDKYLKTDKTKEFLKETIAAELDILIEAYKKKQIEKKNTRFKANKREDILNAIYNEAFWSFINDKDKDKETVVAKGADAYFEKNHHKSNDPDFALNISEKGIVYLLSFFLTNKEMDSLKANLTGFKGKVDRESGNSIKYMATQRIYSFHTYRGLKQKIRTSEEGVKETLLMQMIDELSKVPNVVYQHLSTTQQNSFIEDWNEYYKDYEDDVETDDLSRVIHPVIRKRYEDRFNYFAIRFLDEFFDFPTLRFQVHLGDYVHDRRTKQLGKVESDRIIKEKVTVFARLKDINSAKASYFHSLEEQDKEELDNKWTLFPNPSYDFPKEHTLQHQGEQKNAGKIGIYVKLRDTQYKEKAALEEARKSLNPKERSATKASKYDIITQIIEANDNVKSEKPLVFTGQPIAYLSMNDIHSMLFSLLTDNAELKKTPEEVEAKLIDQIGKQINEILSKDTDTKILKKYKDNDLKETDTDKITRDLARDKEEIEKLILEQKQRADDYNYTSSTKFNIDKSRKRKHLLFNAEKGKIGVWLANDIKRFMFKESKSKWKGYQHTELQKLFAYFDTSKSDLELILSDMVMVKDYPIELIDLVRKSRTLVDFLNKYLEARLGYIENVITRVKNSIGTPQFKTVRKECFAFLKESNYTVASLDKQIERILSMPLFIERGFMDSKPTMLEGKSYQQHKEDFADWFVHYKENSNYQNFYDTEVYEIITEDKREQAKVTKKIKQQQKNDVFTLMMVNYMLEEVLKLPSNDRLSLNELYQTKEERIVNKQVAKDTQERNKNYIWNKVVDLQLCEGLVRIDKVKLKDIGNFRKYENDSRVKEFLTYQSDIVWSGYLSNEVDSNKLYVIERQLDNYESIRSKELLKEVQEIECIVYNQVANKESLKQSGNENFKQYVLQGLLPRGTDVREMLILSTDVKFKKEEIMQLGQVREVEQDLYSLIYIRNKFAHNQLPIKEFFDFCENNYRPISDNEYYAEY YMEIFRSIKEKYAS PrevotellaWP_007412163 MQKQDKLFVDRKKNAIFAFPKYITIMENQEKPEPIYYELTDKHF sp. MSX73WAAFLNLARHNVYTTINHINRRLEIAELKDDGYMMGIKGSWNE (SEQ IDQAKKLDKKVRLRDLIMKHFPFLEAAAYEITNSKSPNNKEQREK No. 
151)EQSEALSLNNLKNVLFIFLEKLQVLRNYYSHYKYSEESPKPIFETSLLKNMYKVFDANVRLVKRDYMHHENIDMQRDFTHLNRKKQVGRTKNIIDSPNFHYHFADKEGNMTIAGLLFFVSLFLDKKDAIWMQKKLKGFKDGRNLREQMTNEVFCRSRISLPKLKLENVQTKDWMQLDMLNELVRCPKSLYERLREKDRESFKVPFDIFSDDYDAEEEPFKNTLVRHQDRFPYFVLRYFDLNEIFEQLRFQIDLGTYHFSIYNKRIGDEDEVRHLTHHLYGFARIQDFAPQNQPEEWRKLVKDLDHFETSQEPYISKTAPHYHLENEKIGIKFCSTHNNLFPSLKREKTCNGRSKFNLGTQFTAEAFLSVHELLPMMFYYLLLTKDYSRKESADKVEGIIRKEISNIYAIYDAFANNEINSIADLTCRLQKTNILQGHLPKQMISILEGRQKDMEKEAERKIGEMIDDTQRRLDLLCKQTNQKIRIGKRNAGLLKSGKIADWLVSDMMRFQPVQKDTNNAPINNSKANSTEYRMLQHALALFGSESSRLKAYFRQMNLVGNANPHPFLAETQWEHQTNILSFYRNYLEARKKYLKGLKPQNWKQYQHFLILKVQKTNRNTLVTGWKNSFNLPRGIFTQPIREWFEKHNNSKRIYDQILSFDRVGFVAKAIPLYFAEEYKDNVQPFYDYPFNIGNKLKPQKGQFLDKKERVELWQKNKELFKNYPSEKNKTDLAYLDFLSWKKFERELRLIKNQDIVTWLMFKELFKTTTVEGLKIGEIHLRDIDTNTANEESNNILNRIMPMKLPVKTYETDNKGNILKERPLATFYIEETETKVLKQGNFKVLAKDRRLNGLLSFAETTDIDLEKNPITKLSVDYELIKYQTTRISIFEMTLGLEKKLIDKYSTLPTDSFRNMLERWLQCKANRPELKNYVNSLIAVRNAFSHNQYPMYDATLFAEVKKFTLFPSVDTKKIELNIAPQLLEIVGKAIKEIEKSENKN Porphyromonas WP_012458414MTEQNERPYNGTYYTLEDKHFWAAFFNLARHNAYITLAHIDRQ gingivalisLAYSKADITNDEDILFFKGQWKNLDNDLERKARLRSLILKHFSF (SEQ IDLEGAAYGKKLFESQSSGNKSSKKKELTKKEKEELQANALSLDN No. 
152)LKSILFDFLQKLKDFRNYYSHYRHPESSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDRYGNNDNPFFKHHFVDREEKVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTETYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKSLYDRLREEDRARFRVPVDILSDEDDTDGTEEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKNIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYITQTTPHYHIEKGKIGLRFVPEGQHLWPSPEVGATRTGRSKYAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSDEASAERVQGRIKRVIEDVYAVYDAFARGEINTRDELDACLADKGIRRGHLPRQMIGILSQEHKDMEEKVRKKLQEMIVDTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLKARKAFLQSIGRSDRVENHRFLLLKEPKTDRQTLVAGWKGEFHLPRGIFTEAVRDCLIEMGLDEVGSYKEVGFMAKAVPLYFERACKDRVQPFYDYPFNVGNSLKPKKGRFLSKEKRAEEWESGKERFRLAKLKKEILEAKEHPYLDFKSWQKFERELRLVKNQDIITWMICRDLMEENKVEGLDTGTLYLKDIRTDVQEQGNLNVLNRVKPMRLPVVVYRADSRGHVHKEQAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGALAMEQYPISKLRVEYELAKYQTARVCAFEQTLELEESLLTRYPHLPDKNFRKMLESWSDPLLDKWPDLHGNVRLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAKEMAERIIQA Paludibacter WP_013446107MKTSANNIYFNGINSFKKIFDSKGAIAPIAEKSCRNFDIKAQNDV propionicigenesNKEQRIHYFAVGHTFKQLDTENLFEYVLDENLRAKRPTRFISLQ (SEQQFDKEFIENIKRLISDIRNINSHYIHRFDPLKIDAVPTNIIDFLKESF ID No.ELAVIQIYLKEKGINYLQFSENPHADQKLVAFLHDKFLPLDEKK 
153)TSMLQNETPQLKEYKEYRKYFKTLSKQAAIDQLLFAEKETDYIWNLFDSHPVLTISAGKYLSFYSCLFLLSMFLYKSEANQLISKIKGFKKNTTEEEKSKREIFTFFSKRFNSMDIDSEENQLVKFRDLILYLNHYPVAWNKDLELDSSNPAMTDKLKSKIIELEINRSFPLYEGNERFATFAKYQIWGKKHLGKSIEKEYINASFTDEEITAYTYETDTCPELKDAHKKLADLKAAKGLFGKRKEKNESDIKKTETSIRELQHEPNPIKDKLIQRIEKNLLTVSYGRNQDRFMDFSARFLAEINYFGQDASFKMYHFYATDEQNSELEKYELPKDKKKYDSLKFHQGKLVHFISYKEHLKRYESWDDAFVIENNAIQLKLSFDGVENTVTIQRALLIYLLEDALRNIQNNTAENAGKQLLQEYYSHNKADLSAFKQILTQQDSIEPQQKTEFKKLLPRRLLNNYSPAINHLQTPHSSLPLILEKALLAEKRYCSLVVKAKAEGNYDDFIKRNKGKQFKLQFIRKAWNLMYFRNSYLQNVQAAGHHKSFHIERDEFNDFSRYMFAFEELSQYKYYLNEMFEKKGFFENNEFKILFQSGTSLENLYEKTKQKFEIWLASNTAKTNKPDNYHLNNYEQQFSNQLFFINLSHFINYLKSTGKLQTDANGQIIYEALNNVQYLIPEYYYTDKPERSESKSGNKLYNKLKATKLEDALLYEMAMCYLKADKQIADKAKHPITKLLTSDVEFNITNKEGIQLYHLLVPFKKIDAFIGLKMHKEQQDKKHPTSFLANIVNYLELVKNDKDIRKTYEAFSTNPVKRTLTYDDLAKIDGHLISKSIKFTNVTLELERYFIFKESLIVKKGNNIDFKYIKGLRNYYNNEKKKNEGIRNKAFHFGIPDSKSYDQLIRDAEVMFIANEVKPTHATKYTDLNKQLHTVCDKLMETVHNDYFSKEGDGKKKREAAGQ KYFENIISAK PorphyromonasWP_013816155 MTEQNEKPYNGTYYTLEDKHFWAAFFNLARHNAYITLAHIDRQ gingivalisLAYSKADITNDEDILFFKGQWKNLDNDLERKARLRSLILKHFSF (SEQ IDLEGAAYGKKLFESQSSGNKSSKNKELTKKEKEELQANALSLDN No. 
154)LKSILFDFLQKLKDFRNYYSHYRHPESSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDRYGNNDNPFFKHHFVDREGTVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTETYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKSLYDRLREEDRARFRVPVDILSDEEDTDGAEEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFQIDLGTYHFAIYKKNIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYITQTTPHYHIEKGKIGLRFVPEGQHLWPSPEVGATRTGRSKYAQDKRFTAEAFLSAHELMPMMFYYFLLREKYSEEASAERVQGRIKRVIEDVYAVYDAFARDEINTRDELDACLADKGIRRGHLPRQMIGILSQEHKDMEEKIRKKLQEMMADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLKARKAFLQSIGRSDRVENHRFLLLKEPKTDRQTLVAGWKGEFHLPRGIFTEAVRDCLIEMGLDEVGSYKEVGFMAKAVPLYFERACKDWVQPFYNYPFNVGNSLKPKKGRFLSKEKRAEEWESGKERFRLAKLKKEILEAKEHPYLDFKSWQKFERELRLVKNQDIITWMICGDLMEENKVEGLDTGTLYLKDIRTDVQEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEQAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGALAMEQYPISKLRVEYELAKYQTARVCAFEQTLELEESLLTRCPHLPDKNFRKMLESWSDPLLDKWPDLHRKVRLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSFPDAIEERMGLNIAHRLSEEVKQAKETVERIIQA Flavobacterium WP_014165541MSSKNESYNKQKTFNHYKQEDKYFFGGFLNNADDNLRQVGKE columnareFKTRINFNHNNNELASVFKDYFNKEKSVAKREHALNLLSNYFP (SEQ IDVLERIQKHTNHNFEQTREIFELLLDTIKKLRDYYTHHYHKPITIN No. 
155)PKIYDFLDDTLLDVLITIKKKKVKNDTSRELLKEKLRPELTQLKNQKREELIKKGKKLLEENLENAVFNHCLRPFLEENKTDDKQNKTVSLRKYRKSKPNEETSITLTQSGLVFLMSFFLHRKEFQVFTSGLEGFKAKVNTIKEEEISLNKNNIVYMITHWSYSYYNFKGLKHRIKTDQGVSTLEQNNTTHSLTNTNTKEALLTQIVDYLSKVPNEIYETLSEKQQKEFEEDINEYMRENPENEDSTFSSIVSHKVIRKRYENKFNYFAMRFLDEYAELPTLRFMVNFGDYIKDRQKKILESIQFDSERIIKKEIHLFEKLSLVTEYKKNVYLKETSNIDLSRFPLFPNPSYVMANNNIPFYIDSRSNNLDEYLNQKKKAQSQNKKRNLTFEKYNKEQSKDAIIAMLQKEIGVKDLQQRSTIGLLSCNELPSMLYEVIVKDIKGAELENKIAQKIREQYQSIRDFTLDSPQKDNIPTTLIKTINTDSSVTFENQPIDIPRLKNAIQKELTLTQEKLLNVKEHEIEVDNYNRNKNTYKFKNQPKNKVDDKKLQRKYVFYRNEIRQEANWLASDLIHFMKNKSLWKGYMHNELQSFLAFFEDKKNDCIALLETVFNLKEDCILTKGLKNLFLKHGNFIDFYKEYLKLKEDFLNTESTFLENGLIGLPPKILKKELSKRFKYIFIVFQKRQFIIKELEEKKNNLYADAINLSRGIFDEKPTMIPFKKPNPDEFASWFVASYQYNNYQSFYELTPDIVERDKKKKYKNLRAINKVKIQDYYLKLMVDTLYQDLFNQPLDKSLSDFYVSKAEREKIKADAKAYQKRNDSSLWNKVIHLSLQNNRITANPKLKDIGKYKRALQDEKIATLLTYDDRTWTYALQKPEKENENDYKELHYTALNMELQEYEKVRSKELLKQVQELEKQILEEYTDFLSTQIHPADFEREGNPNFKKYLAHSILENEDDLDKLPEKVEAMRELDETITNPIIKKAIVLIIIRNKMAHNQYPPKFIYDLANRFVPKKEEEYFATYFNRVFETITKELWENKEKKDKTQV
Psychroflexus torquis (SEQ ID No. 156) WP_015024765 MESIIGLGLSFNPYKTADKHYFGSFLNLVENNLNAVFAEFKERISYKAKDENISSLIEKHFIDNMSIVDYEKKISILNGYLPIIDFLDDELENNLNTRVKNFKKNFIILAEAIEKLRDYYTHFYHDPITFEDNKEPLLELLDEVLLKTILDVKKKYLKTDKTKEILKDSLREEMDLLVIRKTDELREKKKTNPKIQHTDSSQIKNSIFNDAFQGLLYEDKGNNKKTQVSHRAKTRLNPKDIHKQEERDFEIPLSTSGLVFLMSLFLSKKEIEDFKSNIKGFKGKVVKDENHNSLKYMATHRVYSILAFKGLKYRIKTDTFSKETLMMQMIDELSKVPDCVYQNLSETKQKDFIEDWNEYFKDNEENTENLENSRVVHPVIRKRYEDKFNYFAIRFLDEFANFKTLKFQVFMGYYIHDQRTKTIGTTNITTERTVKEKINVFGKLSKMDNLKKHFFSQLSDDENTDWEFFPNPSYNFLTQADNSPANNIPIYLELKNQQIIKEKDAIKAEVNQTQNRNPNKPSKRDLLNKILKTYEDFHQGDPTAILSLNEIPALLHLFLVKPNNKTGQQIENIIRIKIEKQFKAINHPSKNNKGIPKSLFADTNVRVNAIKLKKDLEAELDMLNKKHIAFKENQKASSNYDKLLKEHQFTPKNKRPELRKYVFYKSEKGEEATWLANDIKRFMPKDFKTKWKGCQHSELQRKLAFYDRHTKQDIKELLSGCEFDHSLLDINAYFQKDNFEDFFSKYLENRIETLEGVLKKLHDFKNEPTPLKGVFKNCFKFLKRQNYVTESPEIIKKRILAKPTFLPRGVFDERPTMKKGKNPLKDKNEFAEWFVEYLENKDYQKFYNAEEYRMRDADFKKNAVIKKQKLKDFYTLQMVNYLLKEVFGKDEMNLQLSELFQTRQERLKLQGIAKKQMNKETGDSSENTRNQTYIWNKDVPVSFFNGKVTIDKVKLKNIGKYKRYERDERVKTFIGYEVDEKWMMYLPHNWKDRYSVKPINVIDLQIQEYEEIRSHELLKEIQNLEQYIYDHTTDKNILLQDGNPNFKMYVLNGLLIGIKQVNIPDFIVLKQNTNFDKIDFTGIASCSELEKKTIILIAIRNKFAHNQLPNKMIYDLANEFLKIEKNETYANYYLKVLKKMISDLA
Riemerella anatipestifer (SEQ ID No. 157) WP_015345620 MFFSFHNAQRVIFKHLYKAFDASLRMVKEDYKAHFTVNLTRDFAHLNRKGKNKQDNPDFNRYRFEKDGFFTESGLLFFTNLFLDKRDAYWMLKKVSGFKASHKQREKMTTEVFCRSRILLPKLRLESRYDHNQMLLDMLSELSRCPKLLYEKLSEENKKHFQVEADGFLDEIEEEQNPFKDTLIRHQDRFPYFALRYLDLNESFKSIRFQVDLGTYHYCIYDKKIGDEQEKRHLTRTLLSFGRLQDFTEINRPQEWKALTKDLDYKETSNQPFISKTTPHYHITDNKIGFRLGTSKELYPSLEIKDGANRIAKYPYNSGFVAHAFISVHELLPLMFYQHLTGKSEDLLKETVRHIQRIYKDFEEERINTIEDLEKANQGRLPLGAFPKQMLGLLQNKQPDLSEKAKIKIEKLIAETKLLSHRLNTKLKSSPKLGKRREKLIKTGVLADWLVKDFMRFQPVAYDAQNQPIKSSKANSTEFWFIRRALALYGGEKNRLEGYFKQTNLIGNTNPHPFLNKFNWKACRNLVDFYQQYLEQREKFLEAIKHQPWEPYQYCLLLKVPKENRKNLVKGWEQGGISLPRGLFTEAIRETLSKDLTLSKPIRKEIKKHGRVGFISRAITLYFKEKYQDKHQSFYNLSYKLEAKAPLLKKEEHYEYWQQNKPQSPTESQRLELHTSDRWKDYLLYKRWQHLEKKLRLYRNQDIMLWLMTLELTKNHFKELNLNYHQLKLENLAVNVQEADAKLNPLNQTLPMVLPVKVYPTTAFGEVQYHETPIRTVYIREEQTKALKMGNFKALVKDRRLNGLFSFIKEENDTQKHPISQLRLRRELEIYQSLRVDAFKETLSLEEKLLNKHASLSSLENEFRTLLEEWKKKYAASSMVTDKHIAFIASVRNAFCHNQYPFYKETLHAPILLFTVAQPTTEEKDGLGIAEALLKVLREYCEIVKSQI
Prevotella pleuritidis WP_021584635 MENDKRLEESACYTLNDKHFWAAFLNLARHNVYITVNHINKTLELKNKKNQEIIIDNDQDILAIKTHWAKVNGDLNKTDRLRELMIKHFPFLEAAIYSNNKEDKEEVKEEKQAKAQSFKSLKDCLFLFLEK (SEQ ID No. 
158)LQEARNYYSHYKYSESSKEPEFEEGLLEKMYNTFDASIRLVKEDYQYNKDIDPEKDFKHLERKEDFNYLFTDKDNKGKITKNGLLFFVSLFLEKKDAIWMQQKFRGFKDNRGNKEKMTHEVFCRSRMLLPKIRLESTQTQDWILLDMLNELIRCPKSLYERLQGAYREKFKVPFDSIDEDYDAEQEPFRNTLVRHQDRFPYFALRYFDYNEIFKNLRFQIDLGTYHFSIYKKLIGGKKEDRHLTHKLYGFERIQEFTKQNRPDKWQAIIKDLDTYETSNERYISETTPHYHLENQKIGIRFRNDNNDIWPSLKTNGEKNEKSKYNLDKPYQAEAFLSVHELLPMMFYYLLLKMENTDNDKEDNEVGTKKKGNKNNKQEKHKIEEIIENKIKDIYALYDAFTNGEINSIDELAEQREGKDIEIGHLPKQLIVILKNKSKDMAEKANRKQKEMIKDTKKRLATLDKQVKGEIEDGGRNIRLLKSGEIARWLVNDMMRFQPVQKDNEGKPLNNSKANSTEYQMLQRSLALYNKEEKPTRYFRQVNLIKSSNPHPFLEDTKWEECYNILSFYRNYLKAKIKFLNKLKPEDWKKNQYFLMLKEPKTNRKTLVQGWKNGFNLPRGIFTEPIKEWFKRHQNDSEEYKKVEALDRVGLVAKVIPLFFKEEYFKEDAQKEINNCVQPFYSFPYNVGNIHKPEEKNFLHCEERRKLWDKKKDKFKGYKAKEKSKKMTDKEKEEHRSYLEFQSWNKFERELRLVRNQDILTWLLCTKLIDKLKIDELNIEELQKLRLKDIDTDTAKKEKNNILNRVMPMRLPVTVYEIDKSFNIVKDKPLHTVYIEETGTKLLKQGNFKALVKDRRLNGLFSFVKTSSEAESKSKPISKLRVEYELGAYQKARIDIIKDMLALEKTLIDNDENLPTNKFSDMLKSWLKGKGEANKARLQNDVGLLVAVRNAFSHNQYPMYNSEVFKGMKLLSLSSDIPEKEGLGIAKQLKDKIKETIERIIEIEKE IRN PorphyromonasWP_021663197 MNTVPASENKGQSRTVEDDPQYFGLYLNLARENLIEVESHVRIK gingivalisFGKKKLNEESLKQSLLCDHLLSVDRWTKVYGHSRRYLPFLHYF (SEQ IDDPDSQIEKDHDSKTGVDPDSAQRLIRELYSLLDFLRNDFSHNRL No. 
159)DGTTFEHLEVSPDISSFITGTYSLACGRAQSRFADFFKPDDFVLAKNRKEQLISVADGKECLTVSGLAFFICLFLDREQASGMLSRIRGFKRTDENWARAVHETFCDLCIRHPHDRLESSNTKEALLLDMLNELNRCPRILYDMLPEEERAQFLPALDENSMNNLSENSLNEESRLLWDGSSDWAEALTKRIRHQDRFPYLMLRFIEEMDLLKGIRFRVDLGEIELDSYSKKVGRNGEYDRTITDHALAFGKLSDFQNEEEVSRMISGEASYPVRFSLFAPRYAIYDNKIGYCHTSDPVYPKSKTGEKRALSNPRSMGFISVHDLRKLLLMELLCEGSFSRMQSDFLRKANRILDETAEGKLQFSALFPEMRHRFIPPQNPKSKDRREKAETTLEKYKQEIKGRKDKLNSQLLSAFDMDQRQLPSRLLDEWMNIRPASHSVKLRTYVKQLNEDCRLRLQKFRKDGDGKARAIPLVGEMATFLSQDIVRMIISEETKKLITSAYYNEMQRSLAQYAGEENRHQFRAIVAELRLLDPSSGHPFLSATMETAHRYTEDFYKCYLEKKREWLAKTFYRPEQDENTKRRISVFFVPDGEARKLLPLLIRRRMKEQNDLQDWIRNKQAHPIDLPSHLFDSKIMELLKVKDGKKKWNEAFKDWWSTKYPDGMQPFYGLRRELNIHGKSVSYIPSDGKKFADCYTHLMEKTVQDKKRELRTAGKPVPPDLAADIKRSFHRAVNEREFMLRLVQEDDRLMLMAINKMMTDREEDILPGLKNIDSILDEENQFSLAVHAKVLEKEGEGGDNSLSLVPATIEIKSKRKDWSKYIRYRYDRRVPGLMSHFPEHKATLDEVKTLLGEYDRCRIKIFDWAFALEGAIMSDRDLKPYLHESSSREGKSGEHSTLVKMLVEKKGCLTPDESQYLILIRNKAAHNQFPCAAEMPLIYRDVSAKVGSIEGSSAKDLPEGSSLVDSLWKKYEMIIRKILPILDPENRFFGKLLNNMSQPINDL Porphyromonas WP_021665475MNTVPASENKGQSRTVEDDPQYFGLYLNLARENLIEVESHVRIK gingivalisFGKKKLNEESLKQSLLCDHLLSVDRWTKVYGHSRRYLPFLHYF (SEQ IDDPDSQIEKDHDSKTGVDPDSAQRLIRELYSLLDFLRNDFSHNRL No. 
160)DGTTFEHLEVSPDISSFITGTYSLACGRAQSRFADFFKPDDFVLAKNRKEQLISVADGKECLTVSGLAFFICLFLDREQASGMLSRIRGFKRTNENWARAVHETFCDLCIRHPHDRLESSNTKEALLLDMLNELNRCPRILYDMLPEEERAQFLPALDENSMNNLSENSLNEESRLLWDGSSDWAEALTKRIRHQDRFPYLMLRFIEEMDLLKGIRFRVDLGEIELDSYSKKVGRNGEYDRTITDHALAFGKLSDFQNEEEVSRMISGEASYPVRFSLFAPRYAIYDNKIGYCHTSDPVYPKSKTGEKRALSNPQSMGFISVHDLRKLLLMELLCEGSFSRMQSGFLRKANRILDETAEGKLQFSALFPEMRHRFIPPQNPKSKDRREKAETTLEKYKQEIKGRKDKLNSQLLSAFDMNQRQLPSRLLDEWMNIRPASHSVKLRTYVKQLNEDCRLRLRKFRKDGDGKARAIPLVGEMATFLSQDIVRMIISEETKKLITSAYYNEMQRSLAQYAGEENRRQFRAIVAELHLLDPSSGHPFLSATMETAHRYTEDFYKCYLEKKREWLAKTFYRPEQDENTKRRISVFFVPDGEARKLLPLLIRRRMKEQNDLQDWIRNKQAHPIDLPSHLFDSKIMELLKVKDGKKKWNEAFKDWWSTKYPDGMQPFYGLRRELNIHGKSVSYIPSDGKKFADCYTHLMEKTVQDKKRELRTAGKPVPPDLAADIKRSFHRAVNEREFMLRLVQEDDRLMLMAINKMMTDREEDILPGLKNIDSILDKENQFSLAVHAKVLEKEGEGGDNSLSLVPATIEIKSKRKDWSKYIRYRYDRRVPGLMSHFPEHKATLDEVKTLLGEYDRCRIKIFDWAFALEGAIMSDRDLKPYLHESSSREGKSGEHSTLVKMLVEKKGCLTPDESQYLILIRNKAAHNQFPCAAEMPLIYRDVSAKVGSIEGSSAKDLPEGSSLVDSLWKKYEMIIRKILPILDHENRFFGKLLNNMSQPINDL Porphyromonas WP_021677657MNTVPASENKGQSRTVEDDPQYFGLYLNLARENLIEVESHVRIK gingivalisFGKKKLNEESLKQSLLCDHLLSVDRWTKVYGHSRRYLPFLHYF (SEQ IDDPDSQIEKDHDSKTGVDPDSAQRLIRELYSLLDFLRNDFSHNRL No. 
161)DGTTFEHLEVSPDISSFITGTYSLACGRAQSRFADFFKPDDFVLAKNRKEQLISVADGKECLTVSGLAFFICLFLDREQASGMLSRIRGFKRTDENWARAVHETFCDLCIRHPHDRLESSNTKEALLLDMLNELNRCPRILYDMLPEEERAQFLPALDENSMNNLSENSLNEESRLLWDGSSDWAEALTKRIRHQDRFPYLMLRFIEEMDLLKGIRFRVDLGEIELDSYSKKVGRNGEYDRTITDHALAFGKLSDFQNEEEVSRMISGEASYPVRFSLFAPRYAIYDNKIGYCHTSDPVYPKSKTGEKRALSNPQSMGFISVHDLRKLLLMELLCEGSFSRMQSGFLRKANRILDETAEGKLQFSALFPEMRHRFIPPQNPKSKDRREKAETTLEKYKQEIKGRKDKLNSQLLSAFDMNQRQLPSRLLDEWMNIRPASHSVKLRTYVKQLNEDCRLRLRKFRKDGDGKARAIPLVGEMATFLSQDIVRMIISEETKKLITSAYYNEMQRSLAQYAGEENRRQFRAIVAELHLLDPSSGHPFLSATMETAHRYTEDFYKCYLEKKREWLAKTFYRPEQDENTKRRISVFFVPDGEARKLLPLLIRRRMKEQNDLQDWIRNKQAHPIDLPSHLFDSKIMELLKVKDGKKKWNEAFKDWWSTKYPDGMQPFYGLRRELNIHGKSVSYIPSDGKKFADCYTHLMEKTVQDKKRELRTAGKPVPPDLAADIKRSFHRAVNEREFMLRLVQEDDRLMLMAINKMMTDREEDILPGLKNIDSILDEENQFSLAVHAKVLEKEGEGGDNSLSLVPATIEIKSKRKDWSKYIRYRYDRRVPGLMSHFPEHKATLDEVKTLLGEYDRCRIKIFDWAFALEGAIMSDRDLKPYLHESSSREGKSGEHSTLVKMLVEKKGCLTPDESQYLILIRNKAAHNQFPCAAEMPLIYRDVSAKVGSIEGSSAKDLPEGSSLVDSLWKKYEMIIRKILPILDHENRFFGKLLNNMSQPINDL Porphyromonas WP_021680012MNTVPASENKGQSRTVEDDPQYFGLYLNLARENLIEVESHVRIK gingivalisFGKKKLNEESLKQSLLCDHLLSVDRWTKVYGHSRRYLPFLHYF (SEQ IDDPDSQIEKDHDSKTGVDPDSAQRLIRELYSLLDFLRNDFSHNRL No. 
162)DGTTFEHLEVSPDISSFITGTYSLACGRAQSRFADFFKPDDFVLAKNRKEQLISVADGKECLTVSGLAFFICLFLDREQASGMLSRIRGFKRTDENWARAVHETFCDLCIRHPHDRLESSNTKEALLLDMLNELNRCPRILYDMLPEEERAQFLPALDENSMNNLSENSLNEESRLLWDGSSDWAEALTKRIRHQDRFPYLMLRFIEEMDLLKGIRFRVDLGEIELDSYSKKVGRNGEYDRTITDHALAFGKLSDFQNEEEVSRMISGEASYPVRFSLFAPRYAIYDNKIGYCHTSDPVYPKSKTGEKRALSNPRSMGFISVHDLRKLLLMELLCEGSFSRMQSDFLRKANRILDETAEGKLQFSALFPEMRHRFIPPQNPKSKDRREKAETTLEKYKQEIKGRKDKLNSQLLSAFDMDQRQLPSRLLDEWMNIRPASHSVKLRTYVKQLNEDCRLRLQKFRKDGDGKARAIPLVGEMATFLSQDIVRMIISEETKKLITSAYYNEMQRSLAQYAGEENRHQFRAIVAELRLLDPSSGHPFLSATMETAHRYTEDFYKCYLEKKREWLAKTFYRPEQDENTKRRISVFFVPDGEARKLLPLLIRRRMKEQNDLQDWIRNKQAHPIDLPSHLFDSKVMELLKVKDGKKKWNEAFKDWWSTKYPDGMQPFYGLRRELNIHGKSVSYIPSDGKKFADCYTHLMEKTVRDKKRELRTAGKPVPPDLAAYIKRSFHRAVNEREFMLRLVQEDDRLMLMAINKIMTDREEDILPGLKNIDSILDKENQFSLAVHAKVLEKEGEGGDNSLSLVPATIEIKSKRKDWSKYIRYRYDRRVPGLMSHFPEHKATLDEVKTLLGEYDRCRIKIFDWAFALEGAIMSDRDLKPYLHESSSREGKSGEHSTLVKMLVEKKGCLTPDESQYLILIRNKAAHNQFPCAAEIPLIYRDVSAKVGSIEGSSAKDLPEGSSLVDSLWKKYEMIIRKILPILDPENRFFGKLLNNMSQPINDL Porphyromonas WP_023846767MNTVPASENKGQSRTVEDDPQYFGLYLNLARENLIEVESHVRIK gingivalisFGKKKLNEESLKQSLLCDHLLSVDRWTKVYGHSRRYLPFLHYF (SEQ IDDPDSQIEKDHDSKTGVDPDSAQRLIRELYSLLDFLRNDFSHNRL No. 
163)DGTTFEHLEVSPDISSFITGTYSLACGRAQSRFADFFKPDDFVLAKNRKEQLISVADGKECLTVSGLAFFICLFLDREQASGMLSRIRGFKRTDENWARAVHETFCDLCIRHPHDRLESSNTKEALLLDMLNELNRCPRILYDMLPEEERAQFLPALDENSMNNLSENSLNEESRLLWDGSSDWAEALTKRIRHQDRFPYLMLRFIEEMDLLKGIRFRVDLGEIELDSYSKKVGRNGEYDRTITDHALAFGKLSDFQNEEEVSRMISGEASYPVRFSLFAPRYAIYDNKIGYCHTSDPVYPKSKTGEKRALSNPRSMGFISVHDLRKLLLMELLCEGSFSRMQSDFLRKANRILDETAEGKLQFSALFPEMRHRFIPPQNPKSKDRREKAETTLEKYKQEIKGRKDKLNSQLLSAFDMNQRQLPSRLLDEWMNIRPASHSVKLRTYVKQLNEDCRLRLRKFRKDGDGKARAIPLVGEMATFLSQDIVRMIISEETKKLITSAYYNEMQRSLAQYAGEENRRQFRAIVAELHLLDPSSGHPFLSATMETAHRYTEDFYKCYLEKKREWLAKTFYRPEQDENTKRRISVFFVPDGEARKLLPLLIRRRMKEQNDLQDWIRNKQAHPIDLPSHLFDSKIMELLKVKDGKKKWNEAFKDWWSTKYPDGMQPFYGLRRELNIHGKSVSYIPSDGKKFADCYTHLMEKTVQDKKRELRTAGKPVPPDLAADIKRSFHRAVNEREFMLRLVQEDDRLMLMAINKMMTDREEDILPGLKNIDSILDEENQFSLAVHAKVLEKEGEGGDNSLSLVPATIEIKSKRKDWSKYIRYRYDRRVPGLMSHFPEHKATLDEVKTLLGEYDRCRIKIFDWAFALEGAIMSDRDLKPYLHESSSREGKSGEHSTLVKMLVEKKGCLTPDESQYLILIRNKAAHNQFPCAAEMPLIYRDVSAKVGSIEGSSAKDLPEGSSLVDSLWKKYEMIIRKILPILDPENRFFGKLLNNMSQPINDL Prevotella WP_036884929MKNDNNSTKSTDYTLGDKHFWAAFLNLARHNVYITVNHINKV falseniiLELKNKKDQEIIIDNDQDILAIKTLWGKVDTDINKKDRLRELIM (SEQ IDKHFPFLEAATYQQSSTNNTKQKEEEQAKAQSFESLKDCLFLFLE No. 
164)KLREARNYYSHYKHSKSLEEPKLEEKLLENMYNIFDTNVQLVIKDYEHNKDINPEEDFKHLGRAEGEFNYYFTRNKKGNITESGLLFFVSLFLEKKDAIWAQTKIKGFKDNRENKQKMTHEVFCRSRMLLPKLRLESTQTQDWILLDMLNELIRCPKSLYKRLQGEKREKFRVPFDPADEDYDAEQEPFKNTLVRHQDRFPYFALRYFDYNEIFTNLRFQIDLGTYHFSIYKKQIGDKKEDRHLTHKLYGFERIQEFAKENRPDEWKALVKDLDTFEESNEPYISETTPHYHLENQKIGIRNKNKKKKKTIWPSLETKTTVNERSKYNLGKSFKAEAFLSVHELLPMMFYYLLLNKEEPNNGKINASKVEGIIEKKIRDIYKLYGAFANEEINNEEELKEYCEGKDIAIRHLPKQMIAILKNEYKDMAKKAEDKQKKMIKDTKKRLAALDKQVKGEVEDGGRNIKPLKSGRIASWLVNDMMRFQPVQRDRDGYPLNNSKANSTEYQLLQRTLALFGSERERLAPYFRQMNLIGKDNPHPFLKDTKWKEHNNILSFYRSYLEAKKNFLGSLKPEDWKKNQYFLKLKEPKTNRETLVQGWKNGFNLPRGIFTEPIREWFIRHQNESEEYKKVKDFDRIGLVAKVIPLFFKEDYQKEIEDYVQPFYGYPFNVGNIHNSQEGTFLNKKEREELWKGNKTKFKDYKTKEKNKEKTNKDKFKKKTDEEKEEFRSYLDFQSWKKFERELRLVRNQDIVTWLLCMELIDKLKIDELNIEELQKLRLKDIDTDTAKKEKNNILNRIMPMELPVTVYETDDSNNIIKDKPLHTIYIKEAETKLLKQGNFKALVKDRRLNGLFSFVETSSEAELKSKPISKSLVEYELGEYQRARVEIIKDMLRLEETLIGNDEKLPTNKFRQMLDKWLEHKKETDDTDLKNDVKLLTEVRNAFSHNQYPMRDRIAFANIKPFSLSSANTSNEEGLGIAKKLKDKTKETIDRIIEIEEQTATKR Prevotella WP_036931485MENDKRLEESTCYTLNDKHFWAAFLNLARHNVYITINHINKLL pleuritidisEIRQIDNDEKVLDIKALWQKVDKDINQKARLRELMIKHFPFLEA (SEQ IDAIYSNNKEDKEEVKEEKQAKAQSFKSLKDCLFLFLEKLQEARN No. 
165)YYSHYKSSESSKEPEFEEGLLEKMYNTFGVSIRLVKEDYQYNKDIDPEKDFKHLERKEDFNYLFTDKDNKGKITKNGLLFFVSLFLEKKDAIWMQQKLRGFKDNRGNKEKMTHEVFCRSRMLLPKIRLESTQTQDWILLDMLNELIRCPKSLYERLQGAYREKFKVPFDSIDEDYDAEQEPFRNTLVRHQDRFPYFALRYFDYNEIFKNLRFQIDLGTYHFSIYKKLIGDNKEDRHLTHKLYGFERIQEFAKQKRPNEWQALVKDLDIYETSNEQYISETTPHYHLENQKIGIRFKNKKDKIWPSLETNGKENEKSKYNLDKSFQAEAFLSIHELLPMMFYDLLLKKEEPNNDEKNASIVEGFIKKEIKRMYAIYDAFANEEINSKEGLEEYCKNKGFQERHLPKQMIAILTNKSKNMAEKAKRKQKEMIKDTKKRLATLDKQVKGEIEDGGRNIRLLKSGEIARWLVNDMMRFQSVQKDKEGKPLNNSKANSTEYQMLQRSLALYNKEQKPTPYFIQVNLIKSSNPHPFLEETKWEECNNILSFYRSYLEAKKNFLESLKPEDWKKNQYFLMLKEPKTNRKTLVQGWKNGFNLPRGIFTEPIKEWFKRHQNDSEEYKKVEALDRVGLVAKVIPLFFKEEYFKEDAQKEINNCVQPFYSFPYNVGNIHKPEEKNFLHCEERRKLWDKKKDKFKGYKAKEKSKKMTDKEKEEHRSYLEFQSWNKFERELRLVRNQDIVTWLLCTELIDKLKIDELNIEELQKLRLKDIDTDTAKKEKNNILNRIMPMQLPVTVYEIDKSFNIVKDKPLHTIYIEETGTKLLKQGNFKALVKDRRLNGLFSFVKTSSEAESKSKPISKLRVEYELGAYQKARIDIIKDMLALEKTLIDNDENLPTNKFSDMLKSWLKGKGEANKARLQNDVDLLVAIRNAFSHNQYPMYNSEVFKGMKLLSLSSDIPEKEGLGIAKQLKDKIKETIERIIEIEKEIRN [Porphyromonas WP_039417390MTEQNERPYNGTYYTLEDKHFWAAFFNLARHNAYITLAHIDRQ gingivalisLAYSKADITNDEDILFFKGQWKNLDNDLERKARLRSLILKHFSF (SEQ IDLEGAAYGKKLFESQSSGNKSSKKKELTKKEKEELQANALSLDN No. 
166)LKSILFDFLQKLKDFRNYYSHYRHPESSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDRYGNNDNPFFKHHFVDREGTVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTEAYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKSLYDRLREEDRARFRVPIDILSDEDDTDGTEEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKNIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYITQTTPHYHIEKGKIGLRFVPEGQHLWPSPEVGATRTGRSKYAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSEEVSAEKVQGRIKRVIEDVYAVYDAFARGEIDTLDRLDACLADKGIRRGHLPRQMIAILSQEHKDMEEKVRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLKARKAFLQSIGRSDREENHRFLLLKEPKTDRQTLVAGWKSEFHLPRGIFTEAVRDCLIEMGYDEVGSYKEVGFMAKAVPLYFERACKDRVQPFYDYPFNVGNSLKPKKGRFLSKEKRAEEWESGKERFRLAKLKKEILEAKEHPYLDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRTDVHEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEQAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGALAMEQYPISKLRVEYELAKYQTARVCAFEQTLELEESLLTRYPHLPDKNFRKMLESWSDPLLDKWPDLHRKVRLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAKEMAERIIQV Porphyromonas WP_094189123MTEQSERPYNGTYYTLEDKHFWAAFLNLARHNAYITLTHIDRQ gulaeLAYSKADITNDQDVLSFKALWKNLDNDLERKSRLRSLILKHFSF (SEQ IDLEGAAYGKKLFESKSSGNKSSKNKELTKKEKEELQANALSLDN No. 
167)LKSILFDFLQKLKDFRNYYSHYRHSGSSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDRYGHNDNPSFKHHFVDSEGMVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTETYQQMTNEVFCRSRISLPKLKLESLRMDDWMLLDMLNELVRCPKPLYDRLREDDRACFRVPVDILPDEDDTDGGGEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKMIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYISQTSPHYHIEKGKIGLRFMPEGQHLWPSPEVGTTRTGRSKYAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSEEVSAEKVQGRIKRVIEDVYAIYDAFARDEINTLKELDACLADKGIRRGHLPKQMIAILSQEHKNMEEKVRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDASGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHDTRWESHTNILSFYRSYLRARKAFLERIGRSDRMENRPFLLLKEPKTDRQTLVAGWKSEFHLPRGIFTEAVRDCLIEMGYDEVGSYREVGFMAKAVPLYFERACEDRVQPFYDSPFNVGNSLKPKKGRFLSKEERAEEWERGKERFRDLEAWSHSAARRIEDAFAGIEYASPGNKKKIEQLLRDLSLWEAFESKLKVRADKINLAKLKKEILEAQEHPYHDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRTNVQEQGSLNVLNHVKPMRLPVVVYRADSRGHVHKEEAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGGLAMEQYPISKLRVEYELAKYQTARVCAFEQTLELEESLLTRYPHLPDKNFRKMLESWSDPLLAKWPELHGKVRLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAK ETVERIIQA PorphyromonasWP_039419792 MTEQSERPYNGTYYTLEDKHFWAAFLNLARHNAYITLTHIDRQ gulaeLAYSKADITNDQDVLSFKALWKNLDNDLERKSRLRSLILKHFSF (SEQ IDLEGAAYGKKLFESKSSGNKSSKNKELTKKEKEELQANALSLDN No. 
168)LKSILFDFLQKLKDFRNYYSHYRHSGSSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDRYGHNDNPSFKHHFVDGEGMVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTETYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKPLYDRLREKDRARFRVPVDILPDEDDTDGGGEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKVIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYISQTTPHYHIEKGKIGLRFVPEGQHLWPSPEVGTTRTGRSKYAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSEEVSAEKVQGRIKRVIEDVYAIYDAFARDEINTRDELDACLADKGIRRGHLPKQMIGILSQEHKNMEEKVRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLDETRWESHTNILSFYRSYLRARKAFLERIGRSDRVENRPFLLLKEPKTDRQTLVAGWKSEFHLPRGIFTEAVRDCLIEMGYDEVGSYKEVGFMAKAVPLYFERACKDRVQPFYDSPFNVGNSLKPKKGRFLSKEKRAEEWESGKERFRLAKLKKEILEAQEHPYHDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRPNVQEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEEAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGGLAMEQYPISKLRVEYELAKYQTARVCVFELTLRLEESLLSRYPHLPDESFREMLESWSDPLLAKWPELHGKVRLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAKETVERIIQA Porphyromonas WP_039426176MTEQSERPYNGTYYTLEDKHFWAAFLNLARHNAYITLTHIDRQ gulaeLAYSKADITNDQDVLSFKALWKNFDNDLERKSRLRSLILKHFSF (SEQ IDLEGAAYGKKLFESKSSGNKSSKNKELTKKEKEELQANALSLDN No. 
169)LKSILFDFLQKLKDFRNYYSHYRHSGSSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHYHFNHLVRKGKKDRYGHNDNPSFKHHFVDSEGMVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTGPYEQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKPLYDRLREKDRACFRVPVDILPDEDDTDGGGEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKMIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYISQTTPHYHIEKGKIGLRFMPEGQHLWPSPEVGTTRTGRSKYAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSEEVSAEKVQGRIKRVIKDVYAIYDAFARDEINTLKELDACSADKGIRRGHLPKQMIGILSQEHKNMEEKVRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLDETRWESHTNILSFYRSYLRARKAFLERIGRSDRVENRPFLLLKEPKNDRQTLVAGWKSEFHLPRGIFTEAVRDCLIEMGYDEVGSYKEVGFMAKAVPLYFERACKDRVQPFYDSPFNVGNSLKPKKGRFLSKEKRAEEWESGKERFRLAKLKKEILEAKEHPYHDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRTDVHEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEQAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGGLAMEQYPISKLRVEYELAKYQTARVCAFEQTLELEESLLTRYPHLPDENFREMLESWSDPLLGKWPDLHGKVRLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAKETVERIIQA Porphyromonas WP_039431778MTEQSERPYNGTYYTLEDKHFWAAFLNLARHNAYITLTHIDRQ gulaeLAYSKADITNDQDVLSFKALWKNFDNDLERKSRLRSLILKHFSF (SEQ IDLEGAAYGKKLFESKSSGNKSSKNKELTKKEKEELQANALSLDN No. 
170)LKSILFDFLQKLKDFRNYYSHYRHSESSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDRYGHNDNPSFKHHFVDGEGMVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTETYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKPLYDRLREDDRACFRVPVDILPDEDDTDGGGEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKMIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYISQTSPHYHIEKGKIGLRFMPEGQHLWPSPEVGTTRTGRSKYAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSEEVSAEKVQGRIKRVIEDVYAIYDAFARDEINTLKELDACLADKGIRRGHLPKQMIAILSQEHKDMEEKIRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKKRLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLRARKAFLERIGRSDRMENRPFLLLKEPKTDRQTLVAGWKSEFHLPRGIFTEAVRDCLIEMGYDEVGSYREVGFMAKAVPLYFERACEDRVQPFYDSPFNVGNSLKPKKGRFLSKEERAEEWERGKERFRDLEAWSHSAARRIEDAFAGIEYASPGNKKKIEQLLRDLSLWEAFESKLKVRADKINLAKLKKEILEAQEHPYHDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRPNVQEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEEAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGGLAMEQYPISKLRVEYELAKYQTARVCVFELTLRLEESLLTRYPHLPDESFRKMLESWSDPLLAKWPELHGKVRLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAKE TVERIIQV PorphyromonasWP_039437199 MTEQSERPYNGTYYTLEDKHFWAAFLNLARHNAYITLTHIDRQ gulaeLAYSKADITNDEDILFFKGQWKNLDNDLERKSRLRSLILKHFSF (SEQ IDLEGAAYGKKFFESKSSGNKSSKNKELTKKEKEELQANALSLDN No. 
171)LKSILFDFLQKLKDFRNYYSHYRHSGSSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDEVDPHYHFNHLVRKGKKDRYGHNDNPSFKHHFVDGEGMVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTEPYEQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKPLYDRLREKDRACFRVPVDILPDEDDTDGGGEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKMIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYISQTTPHYHIEKGKIGLRFVPEGQHLWPSPEVGTTRTGRSKYAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSEEVSAEKVQGRIKRVIEDVYAIYDAFARDEINTLKELDACLADKGIRRGHLPKQMIGILSQERKDMEEKVRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLRARKAFLERIGRSDRVENCPFLLLKEPKTDRQTLVAGWKGEFHLPRGIFTEAVRDCLIEMGYDEVGSYREVGFMAKAVPLYFERACEDRVQPFYDSPFNVGNSLKPKKGRFLSKEKRAEEWESGKERFRLAKLKKEILEAQEHPYHDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRPNVQEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEEAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGALAMEQYPISKLRVEYELAKYQTARVCAFEQTLELEESLLTRYPHLPDESFREMLESWSDPLLTKWPELHGKVRLLIAVRNAFSHNQYPMYDEAVFSSIWKYDPSSPDAIEERMGLNIAHRLSEEVKQAKETIERIIQA Porphyromonas WP_039442171MTEQSERPYNGTYYTLEDKHFWAAFLNLARHNAYITLTHIDRQ gulaeLAYSKADITNDQDVLSFKALWKNLDNDLERKSRLRSLILKHFSF (SEQ IDLEGAAYGKKLFESKSSGNKSSKNKELTKKEKEELQANALSLDN No. 
172)LKSILFDFLQKLKDFRNYYSHYRHSGSSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHYHFNHLVRKGKKDRYGHNDNPSFKHHFVDSEGMVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTGPYEQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKPLYDRLREKDRACFRVPVDILPDEDDTDGGGEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKMIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYLETGDKPYISQTTPHYHIEKGKIGLRFVPEGQHLWPSPEVGTTRTGRSKCAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSEEVSAEKVQGRIKRVIEDVYAIYDAFARDEINTLKELDTCLADKGIRRGHLPKQMITILSQERKDMKEKIRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDASGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLRARKAFLERIGRSDRVENCPFLLLKEPKTDRQTLVAGWKDEFHLPRGIFTEAVRDCLIEMGYDEVGSYREVGFMAKAVPLYFERACEDRVQPFYDSPFNVGNSLKPKKGRFLSKEDRAEEWERGMERFRDLEAWSHSAARRIKDAFAGIEYASPGNKKKIEQLLRDLSLWEAFESKLKVRADKINLAKLKKEILEAQEHPYHDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRPNVQEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEAPLATVYIEERNTKLLKQGNFKSFVKDRRLNGLFSFVDTGGLAMEQYPISKLRVEYELAKYQTARVCVFELTLRLEESLLSRYPHLPDESFREMLESWSDPLLAKWPELHGKVRLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAKE TVERIIQA PorphyromonasWP_039445055 MNTVPATENKGQSRTVEDDPQYFGLYLNLARENLIEVESHVRI gulaeKFGKKKLNEESLKQSLLCDHLLSIDRWTKVYGHSRRYLPFLHCF (SEQ IDDPDSGIEKDHDSKTGVDPDSAQRLIRELYSLLDFLRNDFSHNRL No. 
173)DGTTFEHLKVSPDISSFITGAYTFACERAQSRFADFFKPDDFLLAKNRKEQLISVADGKECLTVSGFAFFICLFLDREQASGMLSRIRGFKRTDENWARAVHETFCDLCIRHPHDRLESSNTKEALLLDMLNELNRCPRILYDMLPEEERAQFLPALDENSMNNLSENSLNEESRLLWDGSSDWAEALTKRIRHQDRFPYLMLRFIEEMDLLKGIRFRVDLGEIELDSYSKKVGRNGEYDRTITDHALAFGKLSDFQNEEEVSRMISGEASYPVRFSLFAPRYAIYDNKIGYCHTSDPVYPKSKTGEKRALSNPQSMGFISVHDLRKLLLMELLCEGSFSRMQSDFLRKANRILDETAEGKLQFSALFPEMRHRFIPPQNPKSKDRREKAETTLEKYKQEIKGRKDKLNSQLLSAFDMNQRQLPSRLLDEWMNIRPASHSVKLRTYVKQLNEDCRLRLRKFRKDGDGKARAIPLVGEMATFLSQDIVRMIISEETKKLITSAYYNEMQRSLAQYAGEENRRQFRAIVAELHLLDPSSGHPFLSATMETAHRYTEDFYKCYLEKKREWLAKTFYRPEQDENTKRRISVFFVPDGEARKLLPLLIRRRMKEQNDLQDWIRNKQAHPIDLPSHLFDSKIMELLKVKDGKKKWNEAFKDWWSTKYPDGMQPFYGLRRELNIHGKSVSYIPSDGKKFADCYTHLMEKTVRDKKRELRTAGKPVPPDLAAYIKRSFHRAVNEREFMLRLVQEDDRLMLMAINKMMTDREEDILPGLKNIDSILDEENQFSLAVHAKVLEKEGEGGDNSLSLVPATIEIKSKRKDWSKYIRYRYDRRVPGLMSHFPEHKATLDEVKTLLGEYDRCRIKIFDWAFALEGAIMSDRDLKPYLHESSSREGKSGEHSTLVKMLVEKKGCLTPDESQYLILIRNKAAHNQFPCAAEMPLIYRDVSAKVGSIEGSSAKDLPEGSSLVDSLWKKYEMIIRKILPILDHENRFFGKLLNNMSQPINDL Capnocytophaga WP_041989581MENKTSLGNNIYYNPFKPQDKSYFAGYLNAAMENIDSVFRELG cynodegmiKRLKGKEYTSENFFDAIFKENISLVEYERYVKLLSDYFPMARLL (SEQ IDDKKEVPIKERKENFKKNFRGIIKAVRDLRNFYTHKEHGEVEITD No. 
174)EIFGVLDEMLKSTVLTVKKKKIKTDKTKEILKKSIEKQLDILCQKKLEYLKDTARKIEEKRRNQRERGEKKLVPRFEYSDRRDDLIAAIYNDAFDVYIDKKKDSLKESSKTKYNTESYPQQEEGDLKIPISKNGVVFLLSLFLSKQEVHAFKSKIAGFKATVIDEATVSHRKNSICFMATHEIFSHLAYKKLKRKVRTAEINYSEAENAEQLSIYAKETLMMQMLDELSKVPDVVYQNLSEDVQKTFIEDWNEYLKENNGDVGTMEEEQVIHPVIRKRYEDKFNYFAIRFLDEFAQFPTLRFQVHLGNYLHDSRPKEHLISDRRIKEKITVFGRLSELEHKKALFIKNTETNEDRKHYWEVFPNPNYDFPKENISVNDKDFPIAGSILDREKQPTAGKIGIKVNLLNQKYISEVDKAVKAHQLKQRNNKPSIQNIIEEIVPINGSNPKEIIVFGGQPTAYLSMNDIHSILYEFFDKWEKKKEKLEKKGEKELRKEIGKELEEKIVGKIQTQIQQIIDKDINAKILKPYQDDDSTAIDKEKLIKDLKQEQKILQKLKNEQTAREKEYQECIAYQEESRKIKRSDKSRQKYLRNQLKRKYPEVPTRKEILYYQEKGKVAVWLANDIKRFMPTDFKNEWKGEQHSLLQKSLAYYEQCKEELKNLLPQQKVFKHLPFELGGHFQQKYLYQFYTRYLDKRLEHISGLVQQAENFKNENKVFKKVENECFKFLKKQNYTHKGLDAQAQSVLGYPIFLERGFMDEKPTIIKGKTFKGNESLFTDWFRYYKEYQNFQTFYDTENYPLVELEKKQADRKRETKIYQQKKNDVFTLLMAKHIFKSVFKQDSIDRFSLEDLYQSREERLENQEKAKQTGERNTNYIWNKTVDLNLCDGKVTVENVKLKNVGNFIKYEYDQRVQTFLKYEENIKWQAFLIKESKEEENYPYIVEREIEQYEKVRREELLKEVHLIEEYILEKVKDKEILKKGDNQNFKYYILNGLLKQLKNEDVESYKVFNLNTKPEDVNINQLKQEATDLEQKAFVLTYIRNKFAHNQLPKKEFWDYCQEKYGKIEKEKTYAEYFAEVFKREKEALMK Prevotella WP_042518169MNIPALVENQKKYFGTYSVMAMLNAQTVLDHIQKVADIEGEQ sp. P5-119NENNENLWFHPVMSHLYNAKNGYDKQPEKTMFIIERLQSYFPF (SEQ IDLKIMAENQREYSNGKYKQNRVEVNSNDIFEVLKRAFGVLKMY No. 
175)RDLTNHYKTYEEKLIDGCEFLTSTEQPLSGMISKYYTVALRNTKERYGYKTEDLAFIQDNIKKITKDAYGKRKSQVNTGFFLSLQDYNGDTQKKLHLSGVGIALLICLFLDKQYINIFLSRLPIFSSYNAQSEERRIIIRSFGINSIKLPKDRIHSEKSNKSVAMDMLNEVKRCPDELFTTLSAEKQSRFRIISDDHNEVLMKRSTDRFVPLLLQYIDYGKLFDHIRFHVNMGKLRYLLKADKTCIDGQTRVRVIEQPLNGFGRLEEAETMRKQENGTFGNSGIRIRDFENVKRDDANPANYPYIVDTYTHYILENNKVEMFISDKGSSAPLLPLIEDDRYVVKTIPSCRMSTLEIPAMAFHMFLFGSKKTEKLIVDVHNRYKRLFQAMQKEEVTAENIASFGIAESDLPQKILDLISGNAHGKDVDAFIRLTVDDMLTDTERRIKRFKDDRKSIRSADNKMGKRGFKQISTGKLADFLAKDIVLFQPSVNDGENKITGLNYRIMQSAIAVYDSGDDYEAKQQFKLMFEKARLIGKGTTEPHPFLYKVFARSIPANAVDFYERYLIERKFYLTGLCNEIKRGNRVDVPFIRRDQNKWKTPAMKTLGRIYSEDLPVELPRQMFDNEIKSHLKSLPQMEGIDFNNANVTYLIAEYMKRVLNDDFQTFYQWKRNYHYMDMLKGEYDRKGSLQHCFTSVEEREGLWKERASRTERYRKLASNKIRSNRQMRNASSEEIETILDKRLSNCRNEYQKSEKVIRRYRVQDALLFLLAKKTLTELADFDGERFKLKEIMPDAEKGILSEIMPMSFTFEKGGKKYTITSEGMKLKNYGDFFVLASDKRIGNLLELVGSDIVSKEDIMEEFNKYDQCRPEISSIVFNLEKWAFDTYPELSARVDREEKVDFKSILKILLNNKNINKEQSDILRKIRNAFDHNNYPDKGIVEIKALPEIAMSIKKAFGEYAIMK Prevotella WP_044072147MNIPALVENQKKYFGTYSVMAMLNAQTVLDHIQKVADIEGEQ sp. P4-76NENNENLWFHPVMSHLYNAKNGYDKQPEKTMFIIERLQSYFPF (SEQ IDLKIMAENQREYSNGKYKQNRVEVNSNDIFEVLKRAFGVLKMY No. 
176)RDQASHYKTYDEKLIDGCEFLTSTEQPLSGMINNYYTVALRNMNERYGYKTEDLAFIQDKRFKFVKDAYGKKKSQVNTGFFLSLQDYNGDTQKKLHLSGVGIALLICLFLDKQYINIFLSRLPIFSSYNAQSEERRIIIRSFGINSIKQPKDRIHSEKSNKSVAMDMLNEIKRCPNELFETLSAEKQSRFRIISNDHNEVLMKRSSDRFVPLLLQYIDYGKLFDHIRFHVNMGKLRYLLKADKTCIDGQTRVRVIEQPLNGFGRLEEVETMRKQENGTFGNSGIRIRDFENMKRDDANPANYPYIVDTYTHYILENNKVEMFISDEETPAPLLPVIEDDRYVVKTIPSCRMSTLEIPAMAFHMFLFGSKKTEKLIVDVHNRYKRLFKAMQKEEVTAENIASFGIAESDLPQKIIDLISGNAHGKDVDAFIRLTVDDMLADTERRIKRFKDDRKSIRSADNKMGKRGFKQISTGKLADFLAKDIVLFQPSVNDGENKITGLNYRIMQSAIAVYNSGDDYEAKQQFKLMFEKARLIGKGTTEPHPFLYKVFVRSIPANAVDFYERYLIERKFYLIGLSNEIKKGNRVDVPFIRRDQNKWKTPAMKTLGRIYDEDLPVELPRQMFDNEIKSHLKSLPQMEGIDFNNANVTYLIAEYMKRVLNDDFQTFYQWKRNYRYMDMLRGEYDRKGSLQSCFTSVEEREGLWKERASRTERYRKLASNKIRSNRQMRNASSEEIETILDKRLSNSRNEYQKSEKVIRRYRVQDALLFLLAKKTLTELADFDGERFKLKEIMPDAEKGILSEIMPMSFTFEKGGKKYTITSEGMKLKNYGDFFVLASDKRIGNLLELVGSDTVSKEDIMEEFKKYDQCRPEISSIVFNLEKWAFDTYPELSARVDREEKVDFKSILKILLNNKNINKEQSDILRKIRNAFDHNNYPDKGVVEIRALPEIAMSIKKAFGEYAIMK Prevotella WP_044074780MNIPALVENQKKYFGTYSVMAMLNAQTVLDHIQKVADIEGEQ sp. P5-60NENNENLWFHPVMSHLYNAKNGYDKQPEKTMFIIERLQSYFPF (SEQ IDLKIMAENQREYSNGKYKQNRVEVNSNDIFEVLKRAFGVLKMY No. 
177)RDLTNHYKTYEEKLIDGCEFLTSTEQPFSGMISKYYTVALRNTKERYGYKAEDLAFIQDNRYKFTKDAYGKRKSQVNTGSFLSLQDYNGDTTKKLHLSGVGIALLICLFLDKQYINLFLSRLPIFSSYNAQSEERRIIIRSFGINSIKQPKDRIHSEKSNKSVAMDMLNEVKRCPDELFTTLSAEKQSRFRIISDDHNEVLMKRSSDRFVPLLLQYIDYGKLFDHIRFHVNMGKLRYLLKADKTCIDGQTRVRVIEQPLNGFGRLEEVETMRKQENGTFGNSGIRIRDFENMKRDDANPANYPYIVETYTHYILENNKVEMFISDEENPTPLLPVIEDDRYVVKTIPSCRMSTLEIPAMAFHMFLFGSEKTEKLIIDVHDRYKRLFQAMQKEEVTAENIASFGIAESDLPQKIMDLISGNAHGKDVDAFIRLTVDDMLTDTERRIKRFKDDRKSIRSADNKMGKRGFKQISTGKLADFLAKDIVLFQPSVNDGENKITGLNYRIMQSAIAVYDSGDDYEAKQQFKLMFEKARLIGKGTTEPHPFLYKVFVRSIPANAVDFYERYLIERKFYLIGLSNEIKKGNRVDVPFIRRDQNKWKTPAMKTLGRIYSEDLPVELPRQMFDNEIKSHLKSLPQMEGIDFNNANVTYLIAEYMKRVLNDDFQTFYQWKRNYRYMDMLRGEYDRKGSLQHCFTSIEEREGLWKERASRTERYRKLASNKIRSNRQMRNASSEEIETILDKRLSNCRNEYQKSEKIIRRYRVQDALLFLLAKKTLTELADFDGERFKLKEIMPDAEKGILSEIMPMSFTFEKGGKIYTITSGGMKLKNYGDFFVLASDKRIGNLLELVGSNTVSKEDIMEEFKKYDQCRPEISSIVFNLEKWAFDTYPELPARVDRKEKVDFWSILDVLSNNKDINNEQSYILRKIRNAFDHNNYPDKGIVEIKALPEIAMSIKKAFGEYAIMK Phaeodactylibacter WP_044218239MTNTPKRRTLHRHPSYFGAFLNIARHNAFMIMEHLSTKYDMED xiamenensisKNTLDEAQLPNAKLFGCLKKRYGKPDVTEGVSRDLRRYFPFLN (SEQ IDYPLFLHLEKQQNAEQAATYDINPEDIEFTLKGFFRLLNQMRNNY No. 
178)SHYISNTDYGKFDKLPVQDIYEAAIFRLLDRGKHTKRFDVFESKHTRHLESNNSEYRPRSLANSPDHENTVAFVTCLFLERKYAFPFLSRLDCFRSTNDAAEGDPLIRKASHECYTMFCCRLPQPKLESSDILLDMVNELGRCPSALYNLLSEEDQARFHIKREEITGFEEDPDEELEQEIVLKRHSDRFPYFALRYFDDTEAFQTLRFDVYLGRWRTKPVYKKRIYGQERDRVLTQSIRTFTRLSRLLPIYENVKHDAVRQNEEDGKLVNPDVTSQFHKSWIQIESDDRAFLSDRIEHFSPHYNFGDQVIGLKFINPDRYAAIQNVFPKLPGEEKKDKDAKLVNETADAIISTHEIRSLFLYHYLSKKPISAGDERRFIQVDTETFIKQYIDTIKLFFEDIKSGELQPIADPPNYQKNEPLPYVRGDKEKTQEERAQYRERQKEIKERRKELNTLLQNRYGLSIQYIPSRLREYLLGYKKVPYEKLALQKLRAQRKEVKKRIKDIEKMRTPRVGEQATWLAEDIVFLTPPKMHTPERKTTKHPQKLNNDQFRIMQSSLAYFSVNKKAIKKFFQKETGIGLSNRETSHPFLYRIDVGRCRGILDFYTGYLKYKMDWLDDAIKKVDNRKHGKKEAKKYEKYLPSSIQHKTPLELDYTRLPVYLPRGLFKKAIVKALAAHADFQVEPEEDNVIFCLDQLLDGDTQDFYNWQRYYRSALTEKETDNQLVLAHPYAEQILGTIKTLEGKQKNNKLGNKAKQKIKDELIDLKRAKRRLLDREQYLRAVQAEDRALWLMIQERQKQKAEHEEIAFDQLDLKNITKILTESIDARLRIPDTKVDITDKLPLRRYGDLRRVAKDRRLVNLASYYHVAGLSEIPYDLVKKELEEYDRRRVAFFEHVYQFEKEVYDRYAAELRNENPKGESTYFSHWEYVAVAVKHSADTHFNELFKEKVMQLRNKFHHNEFPYFDWLLPEVEKASAALYADRVFDVAEGYYQKMRKLMRQ Flayobacterium WP_045968377MDNNITVEKTELGLGITYNHDKVEDKHYFGGFFNLAQNNIDLV sp. 316AQEFKKRLLIQGKDSINIFANYFSDQCSITNLERGIKILAEYFPVV (SEQ IDSYIDLDEKNKSKSIREHLILLLETINNLRNYYTHYYHKKIIIDGSL No. 
179)FPLLDTILLKVVLEIKKKKLKEDKTKQLLKKGLEKEMTILFNLMKAEQKEKKIKGWNIDENIKGAVLNRAFSHLLYNDELSDYRKSKYNTEDETLKDTLTESGILFLLSFFLNKKEQEQLKANIKGYKGKIASIPDEEITLKNNSLRNMATHWTYSHLTYKGLKHRIKTDHEKETLLVNMVDYLSKVPHEIYQNLSEQNKSLFLEDINEYMRDNEENHDSSEASRVIHPVIRKRYENKFAYFAIRFLDEFAEFPTLRFMVNVGNYIHDNRKKDIGGTSLITNRTIKQQINVFGNLTEIHKKKNDYFEKEENKEKTLEWELFPNPSYHFQKENIPIFIDLEKSKETNDLAKEYAKEKKKIFGSSRKKQQNTAKKNRETIINLVFDKYKTSDRKTVTFEQPTALLSFNELNSFLYAFLVENKTGKELEKIIIEKIANQYQILKNCSSTVDKTNDNIPKSIKKIVNTTTDSFYFEGKKIDIEKLEKDITIEIEKTNEKLETIKENEESAQNYKRNERNTQKRKLYRKYVFFTNEIGIEATWITNDILRFLDNKENWKGYQHSELQKFISQYDNYKKEALGLLESEWNLESDAFFGQNLKRMFQSNSTFETFYKKYLDNRKNTLETYLSAIENLKTMTDVRPKVLKKKWTELFRFFDKKIYLLSTIETKINELITKPINLSRGIFEEKPTFINGKNPNKENNQHLFANWFIYAKKQTILQDFYNLPLEQPKAITNLKKHKYKLERSINNLKIEDIYIKQMVDFLYQKLFEQSFIGSLQDLYTSKEKREIEKGKAKNEQTPDESFIWKKQVEINTHNGRIIAKTKIKDIGKFKNLLTDNKIAHLISYDDRIWDFSLNNDGDITKKLYSINTELESYETIRREKLLKQIQQFEQFLLEQETEYSAERKHPEKFEKDCNPNFKKYIIEGVLNKIIPNHEIEEIEILKSKEDVFKINFSDILILNNDNIKKGYLLIMIRNKFAHNQLIDKNLFNFSLQLYSKNENENFSEYLNKVCQNIIQEFKEKLK Porphyromonas WP_046201018MTEQSERPYNGTYYTLEDKHFWAAFLNLARHNAYITLTHIDRQ gulaeLAYSKADITNDQDVLSFKALWKNFDNDLERKSRLRSLILKHFSF (SEQ IDLEGAAYGKKLFESKSSGNKSSKNKELTKKEKEELQANALSLDN No. 
180)LKSILFDFLQKLKDFRNYYSHYRHSESSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDRYGHNDNPSFKHHFVDSEGMVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTETYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKPLYDRLREKDRARFRVPVDILPDEDDTDGGGEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKMIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYISQTTPHYHIEKGKIGLRFMPEGQHLWPSPEVGTTRTGRSKYAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSEEVSAEKVQGRIKRVIEDVYAIYDAFARDEINTLKELDACLADKGIRRGHLPKQMIAILSQEHKDMEEKIRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKKRLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLRARKAFLERIGRSDRMENRPFLLLKEPKTDRQTLVAGWKSEFHLPRGIFTEAVRDCLIEMGYDEVGSYREVGFMAKAVPLYFERACEDRVQPFYDSPFNVGNSLKPKKGRFLSKEERAEEWERGKERFRDLEAWSHSAARRIEDAFAGIEYASPGNKKKIEQLLRDLSLWEAFESKLKVRADKINLAKLKKEILEAQEHPYHDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRPNVQEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEEAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGGLAMEQYPISKLRVEYELAKYQTARVCVFELTLRLEESLLTRYPHLPDESFRKMLESWSDPLLAKWPELHGKVRLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAKE TVERIIQV WP_047431796Chryseobacterium METQTIGHGIAYDHSKIQDKHFFGGFLNLAENNIKAVLKAFSEK (SEQ sp.FNVGNVDVKQFADVSLKDNLPDNDFQKRVSFLKMYFPVVDFIN ID No. 
YR477IPNNRAKFRSDLTTLFKSVDQLRNFYTHYYHKPLDFDASLFILLD 181)DIFARTAKEVRDQKMKDDKTRQLLSKSLSEELQKGYELQLERLKELNRLGKKVNIHDQLGIKNGVLNNAFNHLIYKDGESFKTKLTYSSALTSFESAENGIEISQSGLLFLLSMFLKRKEIEDLKNRNKGFKAKVVIDEDGKVNGLKFMATHWVFSYLCFKGLKSKLSTEFHEETLLIQIIDELSKVPDELYCAFDKETRDKFIEDINEYVKEGHQDFSLEDAKVIHPVIRKRYENKFNYFAIRFLDEFVKFPSLRFQVHVGNYVHDRRIKNIDGTTFETERVVKDRIKVFGRLSEISSYKAQYLSSVSDKHDETGWEIFPNPSYVFINNNIPIHISVDTSFKKEIADFKKLRRAQVPDELKIRGAEKKRKFEITQMIGSKSVLNQEEPIALLSLNEIPALLYEILINGKEPAEIERIIKDKLNERQDVIKNYNPENWLPASQISRRLRSNKGERIINTDKLLQLVTKELLVTEQKLKIISDNREALKQKKEGKYIRKFIFTNSELGREAIWLADDIKRFMPADVRKEWKGYQHSQLQQSLAFYNSRPKEALAILESSWNLKDEKIIWNEWILKSFTQNKFFDAFYNEYLKGRKKYFAFLSEHIVQYTSNAKNLQKFIKQQMPKDLFEKRHYIIEDLQTEKNKILSKPFIFPRGIFDKKPTFIKGVKVEDSPESFANWYQYGYQKDHQFQKFYDWKRDYSDVFLEHLGKPFINNGDRRTLGMEELKERIIIKQDLKIKKIKIQDLFLRLIAENLFQKVFKYSAKLPLSDFYLTQEERMEKENMAALQNVREEGDKSPNIIKDNFIWSKMIPYKKGQIIENAVKLKDIGKLNVLSLDDKVQTLLSYDDAKPWSKIALENEFSIGENSYEVIRREKLFKEIQQFESEILFRSGWDGINHPAQLEDNRNPKFKMYIVNGILRKSAGLYSQGEDIWFEYNADFNNLDADVLETKSELVQLAFLVTAIRNKFAHNQLPAKEFYFYIRAKYGFADEPSVALVYLNFTKYAINEFKKVMI Riemerella WP_049354263MFFSFHNAQRVIFKHLYKAFDASLRMVKEDYKAHFTVNLTRDF anatipestiferAHLNRKGKNKQDNPDFNRYRFEKDGFFTESGLLFFTNLFLDKR (SEQ IDDAYWMLKKVSGFKASHKQREKMTTEVFCRSRILLPKLRLESRY No. 
182)DHNQMLLDMLSELSRCPKLLYEKLSEENKKHFQVEADGFLDEIEEEQNPFKDTLIRHQDRFPYFALRYLDLNESFKSIRFQVDLGTYHYCIYDKKIGDEQEKRHLTRTLLSFGRLQDFTEINRPQEWKALTKDLDYKETSNQPFISKTTPHYHITDNKIGFRLGTSKELYPSLEIKDGANRIAKYPYNSGFVAHAFISVHELLPLMFYQHLTGKSEDLLKETVRHIQRIYKDFEEERINTIEDLEKANQGRLPLGAFPKQMLGLLQNKQPDLSEKAKIKIEKLIAETKLLSHRLNTKLKSSPKLGKRREKLIKTGVLADWLVKDFMRFQPVAYDAQNQPIKSSKANSTEFWFIRRALALYGGEKNRLEGYFKQTNLIGNTNPHPFLNKFNWKACRNLVDFYQQYLEQREKFLEAIKNQPWEPYQYCLLLKIPKENRKNLVKGWEQGGISLPRGLFTEAIRETLSEDLMLSKPIRKEIKKHGRVGFISRAITLYFKEKYQDKHQSFYNLSYKLEAKAPLLKREEHYEYWQQNKPQSPTESQRLELHTSDRWKDYLLYKRWQHLEKKLRLYRNQDVMLWLMTLELTKNHFKELNLNYHQLKLENLAVNVQEADAKLNPLNQTLPMVLPVKVYPATAFGEVQYHKTPIRTVYIREEHTKALKMGNFKALVKDRRLNGLFSFIKEENDTQKHPISQLRLRRELEIYQSLRVDAFKETLSLEEKLLNKHTSLSSLENEFRALLEEWKKEYAASSMVTDEHIAFIASVRNAFCHNQYPFYKEALHAPIPLFTVAQPTTEEKDGLGIAEALLKVLREYCEIVKSQI Porphyromonas WP_052912312MTEQNEKPYNGTYYTLEDKHFWAAFFNLARHNAYITLAHIDRQ gingivalisLAYSKADITNDEDILFFKGQWKNLDNDLERKARLRSLILKHFSF (SEQ IDLEGAAYGKKLFESQSSGNKSSKKKELTKKEKEELQANALSLDN No. 
183)LKSILFDFLQKLKDFRNYYSHYRHPESSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDKYGNNDNPFFKHHFVDREEKVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTEAYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKLLYDRLREEDRARFRVPVDILSDEDDTDGTEEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKNIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYITQTTPHYHIEKGKIGLRFVPEGQLLWPSPEVGATRTGRSKYAQDKRFTAEAFLSVHELMPMMFYYFLLREKYSEEASAEKVQGRIKRVIEDVYAVYDAFARDEINTRDELDACLADKGIRRGHLPRQMIAILSQEHKDMEEKVRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLKARKAFLQSIGRSDREENHRFLLLKEPKTDRQTLVAGWKSEFHLPRGIFTEAVRDCLIEMGYDEVGSYKEVGFMAKAVPLYFERACKDRVQPFYDYPFNVGNSLKPKKGRFLSKEKRAEEWESGKERFRDLEAWSHSAARRIEDAFVGIEYASWENKKKIEQLLQDLSLWETFESKLKVKADKINIAKLKKEILEAKEHPYHDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRTDVQEQGSLNVLNHVKPMRLPVVVYRADSRGHVHKEEAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGALAMEQYPISKLRVEYELAKYQTARVCAFEQTLELEESLLTRYPHLPDESFREMLESWSDPLLDKWPDLQREVRLLIAVRNAFSHNQYPMYDETIFSSIRKYDPSSLDAIEERMGLNIAHRLSEEVKLAKEMV ERIIQA PorphyromonasWP_058019250 MTEQNEKPYNGTYYTLKDKHFWAAFFNLARHNAYITLTHIDRQ gingivalisLAYSKADITNDEDILFFKGQWKNLDNDLERKARLRSLILKHFSF (SEQ IDLEGAAYGKKLFESQSSGNKSSKKKELTKKEKEELQANALSLDN No. 
184)LKSILFDFLQKLKDFRNYYSHYRHPESSELPMFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDRCGNNDNPFFKHHFVDREGKVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTETYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKSLYDRLREEDRACFRVPVDILSDEDDTDGAEEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKNIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDCFETGDKPYITQTTPHYHIEKGKIGLRFVPEGQHLWPSPEVGATRTGRSKYAQDKRFTAEAFLSVHELMPMMFYYFLLREKYSEEVSAERVQGRIKRVIEDVYAVYDAFARDEINTRDELDACLADKGIRRGHLPRQMIAILSQKHKDMEEKVRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLKARKAFLQSIGRSDRVENHRFLLLKEPKTDRQTLVAGWKGEFHLPRGIFTEAVRDCLIEMGLDEVGSYKEVGFMAKAVPLYFERACKDRVQPFYDYPFNVGNSLKPKKGRFLSKEKRAEEWESGKERFRDLEAWSHSAARRIEDAFAGIENASRENKKKIEQLLQDLSLWETFESKLKVKADKINIAKLKKEILEAKEHPYLDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRTDVQEQGSLNVLNHVKPMRLPVVVYRADSRGHVHKEQAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGALAMEQYPISKLRVEYELAKYQTARVCAFEQTLELEESLLTRYPHLPDENFRKMLESWSDPLLDKWPDLHRKVRLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQ AKEMAERIIQA FlavobacteriumWP_060381855 MSSKNESYNKQKTFNHYKQEDKYFFGGFLNNADDNLRQVGKE columnareFKTRINFHNNNELASVFKDYFNKEKSVAKREHALNLLSNYFP (SEQ IDVLERIQKHTNHNFEQTREIFELLLDTIKKLRDYYTHHYHKPITIN No. 
185)PKVYDFLDDTLLDVLITIKKKKVKNDTSRELLKEKFRPELTQLKNQKREELIKKGKKLLEENLENAVFNHCLRPFLEENKTDDKQNKTVSLRKYRKSKPNEETSITLTQSGLVFLISFFLHRKEFQVFTSGLEGFKAKVNTIKEEEISLNKNNIVYMITHWSYSYYNFKGLKHRIKTDQGVSTLEQNNTTHSLTNTNTKEALLTQIVDYLSKVPNEIYETLSEKQQKEFEEDINEYMRENPENEDSTFSSIVSHKVIRKRYENKFNYFAMRFLDEYAELPTLRFMVNFGDYIKDRQKKILESIQFDSERIIKKEIHLFEKLGLVTEYKKNVYLKETSNIDLSRFPLFPSPSYVMANNNIPFYIDSRSNNLDEYLNQKKKAQSQNRKRNLTFEKYNKEQSKDAIIAMLQKEIGVKDLQQRSTIGLLSCNELPSMLYEVIVKDIKGAELENKIAQKIREQYQSIRDFTLDSPQKDNIPTTLTKTISTDTSVTFENQPIDIPRLKNALQKELTLTQEKLLNVKQHEIEVDNYNRNKNTYKFKNQPKDKVDDNKLQRKYVFYRNEIGQEANWLASDLIHFMKNKSLWKGYMHNELQSFLAFFEDKKNDCIALLETVFNLKEDCILTKDLKNLFLKHGNFIDFYKEYLKLKEDFLNTESTFLENGFIGLPPKILKKELSKRLNYIFIVFQKRQFIIKELEEKKNNLYADAINLSRGIFDEKPTMIPFKKPNPDEFASWFVASYQYNNYQSFYELTPDKIENDKKKKYKNLRAINKVKIQDYYLKLMVDTLYQDLFNQPLDKSLSDFYVSKTDREKIKADAKAYQKRNDSFLWNKVIHLSLQNNRITANPKLKDIGKYKRALQDEKIATLLTYDDRTWTYALQKPEKENENDYKELHYTALNMELQEYEKVRSKKLLKQVQELEKQILDKFYDFSNNATHPEDLEIEDKKGKRHPNFKLYITKALLKNESEIINLENIDIEILIKYYDYNTEKLKEKIKNMDEDEKAKIVNTKENYNKITNVLIKKALVLIIIRNKMAHNQYPPKFIYDLATRFVPKKEEEYFACY FNRVFETITTELWENKKKAKEIVPorphyromonas WP_061156470 MTEQNERPYNGTYYTLEDKHFWAAFFNLARHNAYITLTHIDRQgingivalis LAYSKADITNDEDILFFKGQWKNLDNDLERKARLRSLILKHFSF (SEQ IDLEGAAYGKKLFENKSSGNKSSKKKELTKKEKEELQANALSLDN No. 
186)LKSILFDFLQKLKDFRNYYSHYRHPESSELPLFDGNMLQRLYNVFDVSVQRVKRDHEHNDKVDPHRHFNHLVRKGKKDRCGNNDNPFFKHHFVDREGKVTEAGLLFFVSLFLEKRDAIWMQKKIRGFKGGTEAYQQMTNEVFCRSRISLPKLKLESLRTDDWMLLDMLNELVRCPKSLYDRLREEDRARFRVPVDILSDEDDTDGTEEDPFKNTLVRHQDRFPYFALRYFDLKKVFTSLRFHIDLGTYHFAIYKKNIGEQPEDRHLTRNLYGFGRIQDFAEEHRPEEWKRLVRDLDYFETGDKPYITQTTPHYHIEKGKIGLRFVPEGQHLWPSPEVGATRTGRSKYAQDKRLTAEAFLSVHELMPMMFYYFLLREKYSEEVSAEKVQGRIKRVIEDVYAVYDAFARGEIDTLDRLDACLADKGIRRGHLPRQMIAILSQEHKDMEEKVRKKLQEMIADTDHRLDMLDRQTDRKIRIGRKNAGLPKSGVIADWLVRDMMRFQPVAKDTSGKPLNNSKANSTEYRMLQRALALFGGEKERLTPYFRQMNLTGGNNPHPFLHETRWESHTNILSFYRSYLKARKAFLQSIGRSDREENHRFLLLKEPKTDRQTLVAGWKSEFHLPRGIFTEAVRDCLIEMGYDEVGSYKEVGFMAKAVPLYFERACKDRVQPFYDYPFNVGNSLKPKKGRFLSKEKRAEEWESGKERFRLAKLKKEILEAKEHPYLDFKSWQKFERELRLVKNQDIITWMMCRDLMEENKVEGLDTGTLYLKDIRTEVQEQGSLNVLNRVKPMRLPVVVYRADSRGHVHKEQAPLATVYIEERDTKLLKQGNFKSFVKDRRLNGLFSFVDTGGLAMEQYPISKLRVEYELAKYQTARVCAFEQTLELEESLLTRCPHLPDKNFRKMLESWSDPLLDKWPDLQREVWLLIAVRNAFSHNQYPMYDEAVFSSIRKYDPSSPDAIEERMGLNIAHRLSEEVKQAKEMAERIIQA Porphyromonas WP_061156637MNTVPASENKGQSRTVEDDPQYFGLYLNLARENLIEVESHVRIK gingivalisFGKKKLNEESLKQSLLCDHLLSVDRWTKVYGHSRRYLPFLHYF (SEQ IDDPDSQIEKDHDSKTGVDPDSAQRLIRELYSLLDFLRNDFSHNRL No. 
187)DGTTFEHLEVSPDISSFITGTYSLACGRAQSRFADFFKPDDFVLAKNRKEQLISVADGKECLTVSGLAFFICLFLDREQASGMLSRIRGFKRTDENWARAVHETFCDLCIRHPHDRLESSNTKEALLLDMLNELNRCPRILYDMLPEEERAQFLPALDENSMNNLSENSLNEESRLLWDGSSDWAEALTKRIRHQDRFPYLMLRFIEEMDLLKGIRFRVDLGEIELDSYSKKVGRNGEYDRTITDHALAFGKLSDFQNEEEVSRMISGEASYPVRFSLFAPRYAIYDNKIGYCHTSDPVYPKSKTGEKRALSNPQSMGFISVHDLRKLLLMELLCEGSFSRMQSGFLRKANRILDETAEGKLQFSALFPEMRHRFIPPQNPKSKDRREKAETTLEKYKQEIKGRKDKLNSQLLSAFDMNQRQLPSRLLDEWMNIRPASHSVKLRTYVKQLNEDCRLRLRKFRKDGDGKARAIPLVGEMATFLSQDIVRMIISEETKKLITSAYYNEMQRSLAQYAGEENRRQFRAIVAELHLLDPSSGHPFLSATMETAHRYTEDFYKCYLEKKREWLAKTFYRPEQDENTKRRISVFFVPDGEARKLLPLLIRRRMKEQNDLQDWIRNKQAHPIDLPSHLFDSKIMELLKVKDGKKKWNEAFKDWWSTKYPDGMQPFYGLRRELNIHGKSVSYIPSDGKKFADCYTHLMEKTVQDKKRELRTAGKPVPPDLAADIKRSFHRAVNEREFMLRLVQEDDRLMLMAINKMMTDREEDILPGLKNIDSILDKENQFSLAVHAKVLEKEGEGGDNSLSLVPATIEIKSKRKDWSKYIRYRYDRRVPGLMSHFPEHKATLDEVKTLLGEYDRCRIKIFDWAFALEGAIMSDRDLKPYLHESSSREGKSGEHSTLVKMLVEKKGCLTPDESQYLILIRNKAAHNQFPCAAEMPLIYRDVSAKVGSIEGSSAKDLPEGSSLVDSLWKKYEMIIRKILPILDPENRFFGKLLNNMSQPINDL Riemerella WP_061710138MFFSFHNAQRVIFKHLYKAFDASLRMVKEDYKAHFTVNLTRDF anatipestiferAHLNRKGKNKQDNPDFNRYRFEKDGFFTESGLLFFTNLFLDKR (SEQ IDDAYWMLKKVSGFKASHKQSEKMTTEVFCRSRILLPKLRLESRY No. 
188)DHNQMLLDMLSELSRCPKLLYEKLSEKDKKCFQVEADGFLDEIEEEQNPFKDTLIRHQDRFPYFALRYLDLNESFKSIRFQVDLGTYHYCIYDKKIGYEQEKRHLTRTLLNFGRLQDFTEINRPQEWKALTKDLDYNETSNQPFISKTTPHYHITDNKIGFRLRTSKELYPSLEVKDGANRIAKYPYNSDFVAHAFISISVHELLPLMFYQHLTGKSEDLLKETVRHIQRIYKDFEEERINTIEDLEKANQGRLPLGAFPKQMLGLLQNKQPDLSEKAKIKIEKLIAETKLLSHRLNTKLKSSPKLGKRREKLIKTGVLADWLVKDFMRFQPVVYDAQNQPIKSSKANSTESRLIRRALALYGGEKNRLEGYFKQTNLIGNTNPHPFLNKFNWKACRNLVDFYQQYLEQREKFLEAIKHQPWEPYQYCLLLKVPKENRKNLVKGWEQGGISLPRGLFTEAIRETLSKDLTLSKPIRKEIKKHGRVGFISRAITLYFKEKYQDKHQSFYNLSYKLEAKAPLLKKEEHYEYWQQNKPQSPTESQRLELHTSDRWKDYLLYKRWQHLEKKLRLYRNQDIMLWLMTLELTKNHFKELNLNYHQLKLENLAVNVQEADAKLNPLNQTLPMVLPVKVYPTTAFGEVQYHETPIRTVYIREEQTKALKMGNFKALVKDRHLNGLFSFIKEENDTQKHPISQLRLRRELEIYQSLRVDAFKETLSLEEKLLNKHASLSSLENEFRTLLEEWKKKYAASSMVTDKHIAFIASVRNAFCHNQYPFYKETLHAPILLFTVAQPTTEEKDGLGIAEALLRVLREYCEIVKSQI Flavobacterium WP_063744070MSSKNESYNKQKTFNHYKQEDKYFFGGFLNNADDNLRQVGKE columnareFKTRINFLASVFKDYFNKEKSVAKREHALNLLSNYFP (SEQ IDVLERIQKHTNHNFEQTREIFELLLDTIKKLRDYYTHHYHKPITIN No. 189)PKIYDFLDDTLLDVLITIKKKKVKNDTSRELLKEKLRPELTQLKNQKREELIKKGKKLLEENLENAVFNHCLRPFLEENKTDDKQNKTVSLRKYRKSKPNEETSITLTQSGLVFLMSFFLHRKEFQVFTSGLEGFKAKVNTIKEEKISLNKNNIVYMITHWSYSYYNFKGLKHRIKTDQGVSTLEQNNTTHSLTNTNTKEALLTQIVDYLSKVPNEIYETLSEKQQKEFEEDINEYMRENPENEDSTFSSIVSHKVIRKRYENKFNYFAMRFLDEYAELPTLRFMVNFGDYIKDRQKKILESIQFDSERIIKKEIHLFEKLGLVTEYKKNVYLKETSNIDLSRFPLFPSPSYVMANNNIPFYIDSRSNNLDEYLNQKKKAQSQNRKRNLTFEKYNKEQSKDAIIAMLQKEIGVKDLQQRSTIGLLSCNELPSMLYEVIVKDIKGAELENKIAQKIREQYQSIRDFTLNSPQKDNIPTTLIKTISTDTSVTFENQPIDIPRLKNAIQKELALTQEKLLNVKQHEIEVNNYNRNKNTYKFKNQPKDKVDDNKLQRKYVFYRNEIGQEANWLASDLIHFMKNKSLWKGYMHNELQSFLAFFEDKKNDCIALLETVFNLKEDCILTKDLKNLFLKHGNFIDFYKEYLKLKEDFLNTESTFLENGFIGLPPKILKKELSKRLNYIFIVFQKRQFIIKELEEKKNNLYADAINLSRGIFDEKPTMIPFKKPNPDEFASWFVASYQYNNYQSFYELTPDKIENDKKKKYKNLRAINKVKIQDYYLKLMVDTLYQDLFNQPLDKSLSDFYVSKTDREKIKADAKAYQKRNDSFLWNKVIHLSLQNNRITANPKLKDIGKYKRALQDEKIATLLTYDDRTWTYALQKPEKENENDYKELHYTALNMELQEYEKVRSKKLLKQVQELEKQILDKFYDFSNNATHPEDLEIEDKKGKRHPNFKLYITKALLKNESEIINLENIDIEILIKYYDYNTEKLKEKIKNMD
EDEKAKIVNTKENYNKITNVLIKKALVLIIIRNKMAHNQYPPKFIYDLATRFVPKKEEEYFACY FNRVFETITTELWENKKKAKEIVRiemerel1a WP_064970887 MEKPLPPNVYTLKHKFFWGAFLNIARHNAFITICHINEQLGLTTPanatipestifer PNDDKIADVVCGTWNNILNNDHDLLKKSQLTELILKHFPFLAA (SEQ IDMCYHPPKKEGKKKGSQKEQQKEKENEAQSQAEALNPSELIKVL No. 190)KTLVKQLRTLRNYYSHHSHKKPDAEKDIFKHLYKAFDASLRMVKEDYKAHFTVNLTQDFAHLNRKGKNKQDNPDFDRYRFEKDGFFTESGLLFFTNLFLDKRDAYWMLKKVSGFKASHKQSEKMTTEVFCRSRILLPKLRLESRYDHNQMLLDMLSELSRYPKLLYEKLSEEDKKRFQVEADGFLDEIEEEQNPFKDTLIRHQDRFPYFALRYLDLNESFKSIRFQVDLGTYHYCIYDKKIGDEQEKRHLTRTLLSFGRLQDFTEINRPQEWKALTKDLDYKETSKQPFISKTTPHYHITDNKIGFRLGTSKELYPSLEVKDGANRIAQYPYNSDFVAHAFISVHELLPLMFYQHLTGKSEDLLKETVRHIQRIYKDFEEERINTIEDLEKANQGRLPLGAFPKQMLGLLQNKQPDLSEKAKIKIEKLIAETKLLSHRLNTKLKSSPKLGKRREKLIKTGVLADWLVKDFMRFQPVAYDAQNQPIESSKANSTEFQLIQRALALYGGEKNRLEGYFKQTNLIGNTNPHPFLNKFNWKACRNLVDFYQQYLEQREKFLEAIKNQPWEPYQYCLLLKIPKENRKNLVKGWEQGGISLPRGLFTEAIRETLSKDLTLSKPIRKEIKKHGRVGFISRAITLYFREKYQDDHQSFYDLPYKLEAKASPLPKKEHYEYWQQNKPQSPTELQRLELHTSDRWKDYLLYKRWQHLEKKLRLYRNQDVMLWLMTLELTKNHFKELNLNYHQLKLENLAVNVQEADAKLNPLNQTLPMVLPVKVYPATAFGEVQYQETPIRTVYIREEQTKALKMGNFKALVKDRRLNGLFSFIKEENDTQKHPISQLRLRRELEIYQSLRVDAFKETLNLEEKLLKKHTSLSSVENKFRILLEEWKKEYAASSMVTDEHIAFIASVRNAFCHNQYPFYEEALHAPIPLFTVAQQTTEEKDGLGIAEALLRVLREYCEIV KSQI SinomicrobiumWP_072319476.1 MESTTTLGLHLKYQHDLFEDKHYFGGGVNLAVQNIESIFQAFA oceaniERYGIQNPLRKNGVPAINNIFHDNISISNYKEYLKFLKQYLPVVG (SEQ IDFLEKSNEINIFEFREDFEILINAIYKLRHFYTHYYHSPIKLEDRFYT No. 
191)CLNELFVAVAIQVKKHKMKSDKTRQLLNKNLHQLLQQLIEQKREKLKDKKAEGEKVSLDTKSIENAVLNDAFVHLLDKDENIRLNYSSRLSEDIITKNGITLSISGLLFLLSLFLQRKEAEDLRSRIEGFKGKGNELRFMATHWVFSYLNVKRIKHRLNTDFQKETLLIQIADELSKVPDEVYKTLDHENRSKFLEDINEYIREGNEDASLNESTVVHGVIRKRYENKFHYLVLRYLDEFVDFPSLRFQVHLGNYIHDRRDKVIDGTNFITNRVIKEPIKVFGKLSHVSKLKSDYMESLSREHKNGWDVFPNPSYNFVGHNIPIFINLRSASSKGKELYRDLMKIKSEKKKKSREEGIPMERRDGKPTKIEISNQIDRNIKDNNFKDIYPGEPLAMLSLNELPALLFELLRRPSITPQDIEDRMVEKLYERFQIIRDYKPGDGLSTSKISKKLRKADNSTRLDGKKLLRAIQTETRNAREKLHTLEENKALQKNRKRRTVYTTREQGREASWLAQDLKRFMPIASRKEWRGYHHSQLQQILAFYDQNPKQPLELLEQFWDLKEDTYVWNSWIHKSLSQHNGFVPMYEGYLKGRLGYYKKLESDIIGFLEEHKVLKRYYTQQHLNVIFRERLYFIKTETKQKLELLARPLVFPRGIFDDKPTFVQDKKVVDHPELFADWYVYSYKDDHSFQEFYHYKRDYNEIFETELSWDIDFKDNKRQLNPSEQMDLFRMKWDLKIKKIKIQDIFLKIVAEDIYLKIFGHKIPLSLSDFYISRQERLTLDEQAVAQSMRLPGDTSENQIKESNLWQTTVPYEKEQIREPKIKLKDIGKFKYFLQQQKVLNLLKYDPQHVWTKAELEEELYIGKHSYEVVRREMLLQKCHQLEKHILEQFRFDGSNHPRELEQGNHPNFKMYIVNGILTKRGELEIEAENWWLELGNSKNSLDKVEVELLTMKTIPEQKAFLLILIRNKFAHNQLPADNYFHYASNLMNLKKSDTYSLFWFTVADTIVQ EFMSL ReichenbachiellaWP_073124441.1 MKTNPLIASSGEKPNYKKFNTESDKSFKKIFQNKGSIAPIAEKACagariperforans KNFEIKSKSPVNRDGRLHYFSVGHAFKNIDSKNVFRYELDESQM (SEQDMKPTQFLALQKEFFDFQGALNGLLKHIRNVNSHYVHTFEKLEI ID No.QSINQKLITFLIEAFELAVIHSYLNEEELSYEAYKDDPQSGQKLV 
192)QFLCDKFYPNKEHEVEERKTILAKNKRQALEHLLFIEVTSDIDWKLFEKHKVFTISNGKYLSFHACLFLLSLFLYKSEANQLISKIKGFKRNDDNQYRSKRQIFTFFSKKFTSQDVNSEEQHLVKFRDVIQYLNHYPSAWNKHLELKSGYPQMTDKLMRYIVEAEIYRSFPDQTDNHRFLLFAIREFFGQSCLDTWTGNTPINFSNQEQKGFSYEINTSAEIKDIETKLKALVLKGPLNFKEKKEQNRLEKDLRREKKEQPTNRVKEKLLTRIQHNMLYVSYGRNQDRFMDFAARFLAETDYFGKDAKFKMYQFYTSDEQRDHLKEQKKELPKKEFEKLKYHQSKLVDYFTYAEQQARYPDWDTPFVVENNAIQIKVTLFNGAKKIVSVQRNLMLYLLEDALYSEKRENAGKGLISGYFVHHQKELKDQLDILEKETEISREQKREFKKLLPKRLLHRYSPAQINDTTEWNPMEVILEEAKAQEQRYQLLLEKAILHQTEEDFLKRNKGKQFKLRFVRKAWHLMYLKELYMNKVAEHGHHKSFHITKEEFNDFCRWMFAFDEVPKYKEYLCDYFSQKGFFNNAEFKDLIESSTSLNDLYEKTKQRFEGWSKDLTKQSDENKYLLANYESMLKDDMLYVNISHFISYLESKGKINRNAHGHIAYKALNNVPHLIEEYYYKDRLAPEEYKSHGKLYNKLKTVKLEDALLYEMAMHYLSLEPALVPKVKTKVKDILSSNIAFDIKDAAGHHLYHLLIPFHKIDSFVALINHQSQQEKDPDKTSFLAKIQPYLEKVKNSKDLKAVYHYYKDTPHTLRYEDLNMIHSHIVSQSVQFTKVALKLEEYFIAKKSITLQIARQISYSEIADLSNYFTDEVRNTAFHFDVPETAYSMILQGIESEFLDREIKPQKPKSLSELSTQQVSVCTAFLETLHNNLFDRKDDKKERLSKARERYFEQIN

In certain example embodiments, the CRISPR effector protein is a Cas13a protein selected from Table 2.

TABLE 2 c2c2-5 1 LachnospiraceaeMQISKVNHKHVAVGQKDRERITGFIYNDPVGDEKSLEDVVA bacteriumKRANDTKVLFNVFNTKDLYDSQESDKSEKDKEIISKGAKFV MA2020AKSFNSAITILKKQNKIYSTLTSQQVIKELKDKFGGARIYDDD (SEQ IDIEEALTETLKKSFRKENVRNSIKVLIENAAGIRSSLSKDEEELI No. 193)QEYFVKQLVEEYTKTKLQKNVVKSIKNQNMVIQPDSDSQVLSLSESRREKQSSAVSSDTLVNCKEKDVLKAFLTDYAVLDEDERNSLLWKLRNLVNLYFYGSESIRDYSYTKEKSVWKEHDEQKANKTLFIDEICHITKIGKNGKEQKVLDYEENRSRCRKQNINYYRSALNYAKNNTSGIFENEDSNHFWIHLIENEVERLYNGIENGEEFKFETGYISEKVWKAVINHLSIKYIALGKAVYNYAMKELSSPGDIEPGKIDDSYINGITSFDYEIIKAEESLQRDISMNVVFATNYLACATVDTDKDFLLFSKEDIRSCTKKDGNLCKNIMQFWGGYSTWKNFCEEYLKDDKDALELLYSLKSMLYSMRNSSFHFSTENVDNGSWDTELIGKLFEEDCNRAARIEKEKFYNNNLHMFYSSSLLEKVLERLYSSHHERASQVPSFNRVFVRKNFPSSLSEQRITPKFTDSKDEQIWQSAVYYLCKEIYYNDFLQSKEAYKLFREGVKNLDKNDINNQKAADSFKQAVVYYGKAIGNATLSQVCQAIMTEYNRQNNDGLKKKSAYAEKQNSNKYKHYPLFLKQVLQSAFWEYLDENKEIYGFISAQIHKSNVEIKAEDFIANYSSQQYKKLVDKVKKTPELQKWYTLGRLINPRQANQFLGSIRNYVQFVKDIQRRAKENGNPIRNYYEVLESDSIIKILEMCTKLNGTTSNDIHDYFRDEDEYAEYISQFVNFGDVHSGAALNAFCNSESEGKKNGIYYDGINPIVNRNWVLCKLYGSPDLISKITSRVNENMIHDFHKQEDLIREYQIKGICSNKKEQQDLRTFQVLKNRVELRDIVEYSEIINELYGQLIKWCYLRERDLMYFQLGFHYLCLNNASSKEADYIKINVDDRNISGAILYQIAAMYINGLPVYYKKDDMYVALKSGKKASDELNSNEQTSKKINYFLKYGNNILGDKKDQLYLAGLELFENVAEHENIIIFRNEIDHFHYFYDRDRSMLDLYSEVFDRFFTYDMKLRKNVVNMLYNILLDHNIVSSFVFETGEKKVGRGDSEVIKPSAKIRLRANNGVSSDVFTYKVGSKDELKIATLPAKNEEFLLNVARLIYYPDMEAVSENMVREGVVKVEKSNDKKGKISRGSNTRSSNQSKYNNKSKNRMNYSMGSIFEKMDLK FD c2c2-6 2 LachnospiraceaeMKISKVREENRGAKLTVNAKTAVVSENRSQEGILYNDPSRY bacteriumGKSRKNDEDRDRYIESRLKSSGKLYRIFNEDKNKRETDELQ NK4A179WFLSEIVKKINRRNGLVLSDMLSVDDRAFEKAFEKYAELSYT (SEQ IDNRRNKVSGSPAFETCGVDAATAERLKGIISETNFINRIKNNID No. 
194)NKVSEDIIDRIIAKYLKKSLCRERVKRGLKKLLMNAFDLPYSDPDIDVQRDFIDYVLEDFYHVRAKSQVSRSIKNMNMPVQPEGDGKFAITVSKGGTESGNKRSAEKEAFKKFLSDYASLDERVRDDMLRRMRRLVVLYFYGSDDSKLSDVNEKFDVWEDHAARRVDNREFIKLPLENKLANGKTDKDAERIRKNTVKELYRNQNIGCYRQAVKAVEEDNNGRYFDDKMLNMFFIHRIEYGVEKIYANLKQVTEFKARTGYLSEKIWKDLINYISIKYIAMGKAVYNYAMDELNASDKKEIELGKISEEYLSGISSFDYELIKAEEMLQRETAVYVAFAARHLSSQTVELDSENSDFLLLKPKGTMDKNDKNKLASNNILNFLKDKETLRDTILQYFGGHSLWTDFPFDKYLAGGKDDVDFLTDLKDVIYSMRNDSFHYATENHNNGKWNKELISAMFEHETERMTVVMKDKFYSNNLPMFYKNDDLKKLLIDLYKDNVERASQVPSFNKVFVRKNFPALVRDKDNLGIELDLKADADKGENELKFYNALYYMFKEIYYNAFLNDKNVRERFITKATKVADNYDRNKERNLKDRIKSAGSDEKKKLREQLQNYIAENDFGQRIKNIVQVNPDYTLAQICQLIMTEYNQQNNGCMQKKSAARKDINKDSYQHYKMLLLVNLRKAFLEFIKENYAFVLKPYKHDLCDKADFVPDFAKYVKPYAGLISRVAGSSELQKWYIVSRFLSPAQANHMLGFLHSYKQYVWDIYRRASETGTEINHSIAEDKIAGVDITDVDAVIDLSVKLCGTISSEISDYFKDDEVYAEYISSYLDFEYDGGNYKDSLNRFCNSDAVNDQKVALYYDGEHPKLNRNIILSKLYGERRFLEKITDRVSRSDIVEYYKLKKETSQYQTKGIFDSEDEQKNIKKFQEMKNIVEFRDLMDYSEIADELQGQLINWIYLRERDLMNFQLGYHYACLNNDSNKQATYVTLDYQGKKNRKINGAILYQICAMYINGLPLYYVDKDSSEWTVSDGKESTGAKIGEFYRYAKSFENTSDCYASGLEIFENISEHDNITELRNYIEHFRYYSSFDRSFLGIYSEVFDRFFTYDLKYRKNVPTILYNILLQHFVNVRFEFVSGKKMIGIDKKDRKIAKEKECARITIREKNGVYSEQFTYKLKNGTVYVDARDKRYLQSIIRLLFYPEKVNMDEMIEVKEKKKPSDNNTGKGYSKRDRQQDRKEYDKY KEKKKKEGNFLSGMGGNINWDEINAQLKNc2c2-7 3 [Clostridium] MKFSKVDHTRSAVGIQKATDSVHGMLYTDPKKQEVNDLDKaminophilum RFDQLNVKAKRLYNVFNQSKAEEDDDEKRFGKVVKKLNRE DSM 10710LKDLLFHREVSRYNSIGNAKYNYYGIKSNPEEIVSNLGMVES SEQ IDLKGERDPQKVISKLLLYYLRKGLKPGTDGLRMILEASCGLRK No. 
195)LSGDEKELKVFLQTLDEDFEKKTFKKNLIRSIENQNMAVQPSNEGDPIIGITQGRFNSQKNEEKSAIERMMSMYADLNEDHREDVLRKLRRLNVLYFNVDTEKTEEPTLPGEVDTNPVFEVWHDHEKGKENDRQFATFAKILTEDRETRKKEKLAVKEALNDLKSAIRDHNIMAYRCSIKVTEQDKDGLFFEDQRINRFWIHHIESAVERILASINPEKLYKLRIGYLGEKVWKDLLNYLSIKYIAVGKAVFHFAMEDLGKTGQDIELGKLSNSVSGGLTSFDYEQIRADETLQRQLSVEVAFAANNLFRAVVGQTGKKIEQSKSEENEEDFLLWKAEKIAESIKKEGEGNTLKSILQFFGGASSWDLNHFCAAYGNESSALGYETKFADDLRKAIYSLRNETFHFTTLNKGSFDWNAKLIGDMFSHEAATGIAVERTRFYSNNLPMFYRESDLKRIMDHLYNTYHPRASQVPSFNSVFVRKNFRLFLSNTLNTNTSFDTEVYQKWESGVYYLFKEIYYNSFLPSGDAHHLFFEGLRRIRKEADNLPIVGKEAKKRNAVQDFGRRCDELKNLSLSAICQMIMTEYNEQNNGNRKVKSTREDKRKPDIFQHYKMLLLRTLQEAFAIYIRREEFKFIFDLPKTLYVMKPVEEFLPNWKSGMFDSLVERVKQSPDLQRWYVLCKFLNGRLLNQLSGVIRSYIQFAGDIQRRAKANHNRLYMDNTQRVEYYSNVLEVVDFCIKGTSRFSNVFSDYFRDEDAYADYLDNYLQFKDEKIAEVSSFAALKTFCNEEEVKAGIYMDGENPVMQRNIVMAKLFGPDEVLKNVVPKVTREEIEEYYQLEKQIAPYRQNGYCKSEEDQKKLLRFQRIKNRVEFQTITEFSEIINELLGQLISWSFLRERDLLYFQLGFHYLCLHNDTEKPAEYKEISREDGTVIRNAILHQVAAMYVGGLPVYTLADKKLAAFEKGEADCKLSISKDTAGAGKKIKDFFRYSKYVLIKDRMLTDQNQKYTIYLAGLELFENTDEHDNITDVRKYVDHFKYYATSDENAMSILDLYSEIHDRFFTYDMKYQKNVANMLENILLRHFVLIRPEFFTGSKKVGEGKKITCKARAQIEIAENGMRSEDFTYKLSDGKKNISTCMIAARDQKYLNTVARLLYYPHEAKKSIVDTREKKNNKKTNRGDGTFNKQKGTARKEKDNGPREFNDTGF SNTPFAGFDPFRNS c2c2-8 5Carnobacterium MRITKVKIKLDNKLYQVTMQKEEKYGTLKLNEESRKSTAEIL gallinarumRLKKASFNKSFHSKTINSQKENKNATIKKNGDYISQIFEKLVG DSM 4847VDTNKNIRKPKMSLTDLKDLPKKDLALFIKRKFKNDDIVEIK (SEQ IDNLDLISLFYNALQKVPGEHFTDESWADFCQEMMPYREYKNK No. 
196)FIERKIILLANSIEQNKGFSINPETFSKRKRVLHQWAIEVQERGDFSILDEKLSKLAEIYNFKKMCKRVQDELNDLEKSMKKGKNPEKEKEAYKKQKNFKIKTIWKDYPYKTHIGLIEKIKENEELNQFNIEIGKYFEHYFPIKKERCTEDEPYYLNSETIATTVNYQLKNALISYLMQIGKYKQFGLENQVLDSKKLQEIGIYEGFQTKFMDACVFATSSLKNIIEPMRSGDILGKREFKEAIATSSFVNYHHFFPYFPFELKGMKDRESELIPFGEQTEAKQMQNIWALRGSVQQIRNEIFHSFDKNQKFNLPQLDKSNFEFDASENSTGKSQSYIETDYKFLFEAEKNQLEQFFIERIKSSGALEYYPLKSLEKLFAKKEMKFSLGSQVVAFAPSYKKLVKKGHSYQTATEGTANYLGLSYYNRYELKEESFQAQYYLLKLIYQYVFLPNFSQGNSPAFRETVKAILRINKDEARKKMKKNKKFLRKYAFEQVREMEFKETPDQYMSYLQSEMREEKVRKAEKNDKGFEKNITMNFEKLLMQIFVKGFDVFLTTFAGKELLLSSEEKVIKETEISLSKKINEREKTLKASIQVEHQLVATNSAISYWLFCKLLDSRHLNELRNEMIKFKQSRIKFNHTQHAELIQNLLPIVELTILSNDYDEKNDSQNVDVSAYFEDKSLYETAPYVQTDDRTRVSFRPILKLEKYHTKSLIEALLKDNPQFRVAATDIQEWMHKREEIGELVEKRKNLHTEWAEGQQTLGAEKREEYRDYCKKIDRFNWKANKVTLTYLSQLHYLITDLLGRMVGFSALFERDLVYFSRSFSELGGETYHISDYKNLSGVLRLNAEVKPIKIKNIKVIDNEENPYKGNEPEVKPFLDRLHAYLENVIGIKAVHGKIRNQTAHLSVLQLELSMIESMNNLRDLMAYDRKLKNAVTKSMIKILDKHGMILKLKIDENHKNFEIESLIPKEIIHLKDKAIKTNQVSEEYCQLVLALLTTNPGNQLN c2c2-9 6 CarnobacteriumMRMTKVKINGSPVSMNRSKLNGHLVWNGTTNTVNILTKKE gallinarumQSFAASFLNKTLVKADQVKGYKVLAENIFIIFEQLEKSNSEKP DSM 4847SVYLNNIRRLKEAGLKRFFKSKYHEEIKYTSEKNQSVPTKLN (SEQ IDLIPLFFNAVDRIQEDKFDEKNWSYFCKEMSPYLDYKKSYLNR No. 
197)KKEILANSIQQNRGFSMPTAEEPNLLSKRKQLFQQWAMKFQESPLIQQNNFAVEQFNKEFANKINELAAVYNVDELCTAITEKLMNFDKDKSNKTRNFEIKKLWKQHPHNKDKALIKLFNQEGNEALNQFNIELGKYFEHYFPKTGKKESAESYYLNPQTIIKTVGYQLRNAFVQYLLQVGKLHQYNKGVLDSQTLQEIGMYEGFQTKFMDACVFASSSLRNIIQATTNEDILTREKFKKELEKNVELKHDLFFKTEIVEERDENPAKKIAMTPNELDLWAIRGAVQRVRNQIFHQQINKRHEPNQLKVGSFENGDLGNVSYQKTIYQKLFDAEIKDIEIYFAEKIKSSGALEQYSMKDLEKLFSNKELTLSLGGQVVAFAPSYKKLYKQGYFYQNEKTIELEQFTDYDFSNDVFKANYYLIKLIYHYVFLPQFSQANNKLFKDTVHYVIQQNKELNTTEKDKKNNKKIRKYAFEQVKLMKNESPEKYMQYLQREMQEERTIKEAKKTNEEKPNYNFEKLLIQIFIKGFDTFLRNFDLNLNPAEELVGTVKEKAEGLRKRKERIAKILNVDEQIKTGDEEIAFWIFAKLLDARHLSELRNEMIKFKQSSVKKGLIKNGDLIEQMQPILELCILSNDSESMEKESFDKIEVFLEKVELAKNEPYMQEDKLTPVKFRFMKQLEKYQTRNFIENLVIENPEFKVSEKIVLNWHEEKEKIADLVDKRTKLHEEWASKAREIEEYNEKIKKNKSKKLDKPAEFAKFAEYKIICEAIENFNRLDHKVRLTYLKNLHYLMIDLMGRMVGFSVLFERDFVYMGRSYSALKKQSIYLNDYDTFANIRDWEVNENKHLFGTSSSDLTFQETAEFKNLKKPMENQLKALLGVTNHSFEIRNNIAHLHVLRNDGKGEGVSLLSCMNDLRKLMSYDRKLKNAVTKAIIKILDKHGMILKLTNNDHTKPFEIESLKPKKIIHLEKSNHSFPMDQVSQEYCDLVKKMLVFTN c2c2- 7 PaludibacterMRVSKVKVKDGGKDKMVLVHRKTTGAQLVYSGQPVSNET 10 propionicigenesSNILPEKKRQSFDLSTLNKTIIKFDTAKKQKLNVDQYKIVEKI WB4FKYPKQELPKQIKAEEILPFLNHKFQEPVKWKNGKEESFNL (SEQ IDTLLIVEAVQAQDKRKLQPYYDWKTWYIQTKSDLLKKSIENN No. 
198)RIDLTENLSKRKKALLAWETEFTASGSIDLTHYHKVYMTDVLCKMLQDVKPLTDDKGKINTNAYHRGLKKALQNHQPAIFGTREVPNEANRADNQLSIYHLEVVKYLEHYFPIKTSKRRNTADDIAHYLKAQTLKTTIEKQLVNAIRANIIQQGKTNHHELKADTTSNDLIRIKTNEAFVLNLTGTCAFAANNIRNMVDNEQTNDILGKGDFIKSLLKDNTNSQLYSFFFGEGLSTNKAEKETQLWGIRGAVQQIRNNVNHYKKDALKTVFNISNFENPTITDPKQQTNYADTIYKARFINELEKIPEAFAQQLKTGGAVSYYTIENLKSLLTTFQFSLCRSTIPFAPGFKKVFNGGINYQNAKQDESFYELMLEQYLRKENFAEESYNARYFMLKLIYNNLFLPGFTTDRKAFADSVGFVQMQNKKQAEKVNPRKKEAYAFEAVRPMTAADSIADYMAYVQSELMQEQNKKEEKVAEETRINFEKFVLQVFIKGFDSFLRAKEFDFVQMPQPQLTATASNQQKADKLNQLEASITADCKLTPQYAKADDATHIAFYVFCKLLDAAHLSNLRNELIKFRESVNEFKFHHLLEIIEICLLSADVVPTDYRDLYSSEADCLARLRPFIEQGADITNWSDLFVQSDKHSPVIHANIELSVKYGTTKLLEQIINKDTQFKTTEANFTAWNTAQKSIEQLIKQREDHHEQWVKAKNADDKEKQERKREKSNFAQKFIEKHGDDYLDICDYINTYNWLDNKMHFVHLNRLHGLTIELLGRMAGFVALFDRDFQFFDEQQIADEFKLHGFVNLHSIDKKLNEVPTKKIKEIYDIRNKIIQINGNKINESVRANLIQFISSKRNYYNNAFLHVSNDEIKEKQMYDIRNHIAHFNYLTKDAADFSLIDLINELRELLHYDRKLKNAVSKAFIDLFDKHGMILKLKLNADHKLKVESLEPKKIYHLGSSAKDKPEYQYCTNQVMMAYCNMCRSLLEMKK c2c2- 9 ListeriaMLALLHQEVPSQKLHNLKSLNTESLTKLFKPKFQNMISYPPS 11 weihenstephanensisKGAEHVQFCLTDIAVPAIRDLDEIKPDWGIFFEKLKPYTDWA FSL R9-ESYIHYKQTTIQKSIEQNKIQSPDSPRKLVLQKYVTAFLNGEP 0317 (SEQLGLDLVAKKYKLADLAESFKVVDLNEDKSANYKIKACLQQ ID No. 
199)HQRNILDELKEDPELNQYGIEVKKYIQRYFPIKRAPNRSKHARADFLKKELIESTVEQQFKNAVYHYVLEQGKMEAYELTDPKTKDLQDIRSGEAFSFKFINACAFASNNLKMILNPECEKDILGKGDFKKNLPNSTTQSDVVKKMIPFFSDEIQNVNFDEAIWAIRGSIQQIRNEVYHCKKHSWKSILKIKGFEFEPNNMKYTDSDMQKLMDKDIAKIPDFIEEKLKSSGIIRFYSHDKLQSIWEMKQGFSLLTTNAPFVPSFKRVYAKGHDYQTSKNRYYDLGLTTFDILEYGEEDFRARYFLTKLVYYQQFMPWFTADNNAFRDAANFVLRLNKNRQQDAKAFINIREVEEGEMPRDYMGYVQGQIAIHEDSTEDTPNHFEKFISQVFIKGFDSHMRSADLKFIKNPRNQGLEQSEIEEMSFDIKVEPSFLKNKDDYIAFWTFCKMLDARHLSELRNEMIKYDGHLTGEQEIIGLALLGVDSRENDWKQFFSSEREYEKIMKGYVGEELYQREPYRQSDGKTPILFRGVEQARKYGTETVIQRLFDASPEFKVSKCNITEWERQKETIEETIERRKELHNEWEKNPKKPQNNAFFKEYKECCDAIDAYNWHKNKTTLVYVNELHHLLIEILGRYVGYVAIADRDFQCMANQYFKHSGITERVEYWGDNRLKSIKKLDTFLKKEGLFVSEKNARNHIAHLNYLSLKSECTLLYLSERLREIFKYDRKLKNAVSKSLIDILDRHGMSVVFANLKENKHRLVIKSLEPKKLRHLGEKKIDNGYIETNQVSEEY CGIVKRLLEI c2c2- 10Listeriaceae MKITKMRVDGRTIVMERTSKEGQLGYEGIDGNKTTEIIFDKK 12 bacteriumKESFYKSILNKTVRKPDEKEKNRRKQAINKAINKEITELMLA FSL M6-VLHQEVPSQKLHNLKSLNTESLTKLFKPKFQNMISYPPSKGA 0635 =EHVQFCLTDIAVPAIRDLDEIKPDWGIFFEKLKPYTDWAESYI ListeriaHYKQTTIQKSIEQNKIQSPDSPRKLVLQKYVTAFLNGEPLGL newyorkensis FSLDLVAKKYKLADLAESFKLVDLNEDKSANYKIKACLQQHQR M6-0635NILDELKEDPELNQYGIEVKKYIQRYFPIKRAPNRSKHARADF (SEQ IDLKKELIESTVEQQFKNAVYHYVLEQGKMEAYELTDPKTKDL No. 
200)QDIRSGEAFSFKFINACAFASNNLKMILNPECEKDILGKGNFKKNLPNSTTRSDVVKKMIPFFSDELQNVNFDEAIWAIRGSIQQIRNEVYHCKKHSWKSILKIKGFEFEPNNMKYADSDMQKLMDKDIAKIPEFIEEKLKSSGVVRFYRHDELQSIWEMKQGFSLLTTNAPFVPSFKRVYAKGHDYQTSKNRYYNLDLTTFDILEYGEEDFRARYFLTKLVYYQQFMPWFTADNNAFRDAANFVLRLNKNRQQDAKAFINIREVEEGEMPRDYMGYVQGQIAIHEDSIEDTPNHFEKFISQVFIKGFDRHMRSANLKFIKNPRNQGLEQSEIEEMSFDIKVEPSFLKNKDDYIAFWIFCKMLDARHLSELRNEMIKYDGHLTGEQEIIGLALLGVDSRENDWKQFFSSEREYEKIMKGYVVEELYQREPYRQSDGKTPILFRGVEQARKYGTETVIQRLFDANPEFKVSKCNLAEWERQKETIEETIKRRKELHNEWAKNPKKPQNNAFFKEYKECCDAIDAYNWHKNKTTLAYVNELHHLLIEILGRYVGYVAIADRDFQCMANQYFKHSGITERVEYWGDNRLKSIKKLDTFLKKEGLFVSEKNARNHIAHLNYLSLKSECTLLYLSERLREIFKYDRKLKNAVSKSLIDILDRHGMSVVFANLKENKHRLVIKSLEPKKLRHLGGKKIDGGYIETNQVSEEYCGI VKRLLEM c2c2- 12 LeptotrichiaMKVTKVDGISHKKYIEEGKLVKSTSEENRTSERLSELLSIRLD 13 wadeiIYIKNPDNASEEENRIRRENLKKFFSNKVLHLKDSVLYLKNR F0279KEKNAVQDKNYSEEDISEYDLKNKNSFSVLKKILLNEDVNSE (SEQ IDELEIFRKDVEAKLNKINSLKYSFEENKANYQKINENNVEKVG No. 201)GKSKRNIIYDYYRESAKRNDYINNVQEAFDKLYKKEDIEKLFFLIENSKKHEKYKIREYYHKIIGRKNDKENFAKIIYEEIQNVNNIKELIEKIPDMSELKKSQVFYKYYLDKEELNDKNIKYAFCHFVEIEMSQLLKNYVYKRLSNISNDKIKRIFEYQNLKKLIENKLLNKLDTYVRNCGKYNYYLQVGEIATSDFIARNRQNEAFLRNIIGVSSVAYFSLRNILETENENDITGRMRGKTVKNNKGEEKYVSGEVDKIYNENKQNEVKENLKMFYSYDFNMDNKNEIEDFFANIDEAISSIRHGIVHFNLELEGKDIFAFKNIAPSEISKKMFQNEINEKKLKLKIFKQLNSANVFNYYEKDVIIKYLKNTKFNFVNKNIPFVPSFTKLYNKIEDLRNTLKFFWSVPKDKEEKDAQIYLLKNIYYGEFLNKFVKNSKVFFKITNEVIKINKQRNQKTGHYKYQKFENIEKTVPVEYLAIIQSREMINNQDKEEKNTYIDFIQQIFLKGFIDYLNKNNLKYIESNNNNDNNDIFSKIKIKKDNKEKYDKILKNYEKHNRNKEIPHEINEFVREIKLGKILKYTENLNMFYLILKLLNHKELTNLKGSLEKYQSANKEETFSDELELINLLNLDNNRVTEDFELEANEIGKFLDFNENKIKDRKELKKFDTNKIYFDGENIIKHRAFYNIKKYGMLNLLEKIADKAKYKISLKELKEYSNKKNEIEKNYTMQQNLHRKYARPKKDEKFNDEDYKEYEKAIGNIQKYTHLKNKVEFNELNLLQGLLLKILHRLVGYTSIWERDLRFRLKGEFPENHYIEEIFNFDNSKNVKYKSGQIVEKYINFYKELYKDNVEKRSIYSDKKVKKLKQEKKDLYIRNYIAHFNYIPHAEISLLEVLENLRKLLSYDRKLKNAIMKSIVDILKEYGFVATFKIGADKKIEIQTLESEKIVHLKNLKKKKLMTDRNSEELCELVK VMFEYKALE c2c2- 15Rhodobacter MQIGKVQGRTISEFGDPAGGLKRKISTDGKNRKELPAHLSSD 14 
capsulatusPKALIGQWISGIDKIYRKPDSRKSDGKAIHSPTPSKMQFDARD SB 1003DLGEAFWKLVSEAGLAQDSDYDQFKRRLHPYGDKFQPADS (SEQ IDGAKLKFEADPPEPQAFHGRWYGAMSKRGNDAKELAAALYE No. 202)HLHVDEKRIDGQPKRNPKTDKFAPGLVVARALGIESSVLPRGMARLARNWGEEEIQTYFVVDVAASVKEVAKAAVSAAQAFDPPRQVSGRSLSPKVGFALAEHLERVTGSKRCSFDPAAGPSVLALHDEVKKTYKRLCARGKNAARAFPADKTELLALMRHTHENRVRNQMVRMGRVSEYRGQQAGDLAQSHYWTSAGQTEIKESEIFVRLWVGAFALAGRSMKAWIDPMGKIVNTEKNDRDLTAAVNIRQVISNKEMVAEAMARRGIYFGETPELDRLGAEGNEGFVFALLRYLRGCRNQTFHLGARAGFLKEIRKELEKTRWGKAKEAEHVVLTDKTVAAIRAIIDNDAKALGARLLADLSGAFVAHYASKEHFSTLYSEIVKAVKDAPEVSSGLPRLKLLLKRADGVRGYVHGLRDTRKHAFATKLPPPPAPRELDDPATKARYIALLRLYDGPFRAYASGITGTALAGPAARAKEAATALAQSVNVTKAYSDVMEGRTSRLRPPNDGETLREYLSALTGETATEFRVQIGYESDSENARKQAEFIENYRRDMLAFMFEDYIRAKGFDWILKIEPGATAMTRAPVLPEPIDTRGQYEHWQAALYLVMHFVPASDVSNLLHQLRKWEALQGKYELVQDGDATDQADARREALDLVKRFRDVLVLFLKTGEARFEGRAAPFDLKPFRALFANPATFDRLFMATPTTARPAEDDPEGDGASEPELRVARTLRGLRQIARYNHMAVLSDLFAKHKVRDEEVARLAEIEDETQEKSQIVAAQELRTDLHDKVMKCHPKTISPEERQSYAAAIKTIEEHRFLVGRVYLGDHLRLHRLMMDVIGRLIDYAGAYERDTGTFLINASKQLGAGADWAVTIAGAANTDARTQTRKDLAHFNVLDRADGTPDLTALVNRAREMMAYDRKRKNAVPRSILDMLARLGLTLKWQMKDHLLQDATITQAAIKHLDKVRLTVGGPAAVTEARFSQDYLQMVAAVFNGSVQNPKPRRRDDGDAWHKPPKPATAQSQPDQKPPNKAPSAGSRLPPPQVGEVYEGVVVKVIDTGSLGFLAVEGVAGNIGLHISRLRRIREDAIIVGRRYRFRVEIYVPPKSNTSKL NAADLVRID c2c2- 16Rhodobacter MQIGKVQGRTISEFGDPAGGLKRKISTDGKNRKELPAHLSSD 15 capsulatusPKALIGQWISGIDKIYRKPDSRKSDGKAIHSPTPSKMQFDARD R121 (SEQDLGEAFWKLVSEAGLAQDSDYDQFKRRLHPYGDKFQPADS ID No. 
203)GAKLKFEADPPEPQAFHGRWYGAMSKRGNDAKELAAALYEHLHVDEKRIDGQPKRNPKTDKFAPGLVVARALGIESSVLPRGMARLARNWGEEEIQTYFVVDVAASVKEVAKAAVSAAQAFDPPRQVSGRSLSPKVGFALAEHLERVTGSKRCSFDPAAGPSVLALHDEVKKTYKRLCARGKNAARAFPADKTELLALMRHTHENRVRNQMVRMGRVSEYRGQQAGDLAQSHYWTSAGQTEIKESEIFVRLWVGAFALAGRSMKAWIDPMGKIVNTEKNDRDLTAAVNIRQVISNKEMVAEAMARRGIYFGETPELDRLGAEGNEGFVFALLRYLRGCRNQTFHLGARAGFLKEIRKELEKTRWGKAKEAEHVVLTDKTVAAIRAIIDNDAKALGARLLADLSGAFVAHYASKEHFSTLYSEIVKAVKDAPEVSSGLPRLKLLLKRADGVRGYVHGLRDTRKHAFATKLPPPPAPRELDDPATKARYIALLRLYDGPFRAYASGITGTALAGPAARAKEAATALAQSVNVTKAYSDVMEGRSSRLRPPNDGETLREYLSALTGETATEFRVQIGYESDSENARKQAEFIENYRRDMLAFMFEDYIRAKGFDWILKIEPGATAMTRAPVLPEPIDTRGQYEHWQAALYLVMHFVPASDVSNLLHQLRKWEALQGKYELVQDGDATDQADARREALDLVKRFRDVLVLFLKTGEARFEGRAAPFDLKPFRALFANPATFDRLFMATPTTARPAEDDPEGDGASEPELRVARTLRGLRQIARYNHMAVLSDLFAKHKVRDEEVARLAEIEDETQEKSQIVAAQELRTDLHDKVMKCHPKTISPEERQSYAAAIKTIEEHRFLVGRVYLGDHLRLHRLMMDVIGRLIDYAGAYERDTGTFLINASKQLGAGADWAVTIAGAANTDARTQTRKDLAHFNVLDRADGTPDLTALVNRAREMMAYDRKRKNAVPRSILDMLARLGLTLKWQMKDHLLQDATITQAAIKHLDKVRLTVGGPAAVTEARFSQDYLQMVAAVFNGSVQNPKPRRRDDGDAWHKPPKPATAQSQPDQKPPNKAPSAGSRLPPPQVGEVYEGVVVKVIDTGSLGFLAVEGVAGNIGLHISRLRRIREDAIIVGRRYRFRVEIYVPPKSNTSKL NAADLVRID c2c2- 17Rhodobacter MQIGKVQGRTISEFGDPAGGLKRKISTDGKNRKELPAHLSSD 16 capsulatusPKALIGQWISGIDKIYRKPDSRKSDGKAIHSPTPSKMQFDARD DE442DLGEAFWKLVSEAGLAQDSDYDQFKRRLHPYGDKFQPADS (SEQ IDGAKLKFEADPPEPQAFHGRWYGAMSKRGNDAKELAAALYE No. 
204)HLHVDEKRIDGQPKRNPKTDKFAPGLVVARALGIESSVLPRGMARLARNWGEEEIQTYFVVDVAASVKEVAKAAVSAAQAFDPPRQVSGRSLSPKVGFALAEHLERVTGSKRCSFDPAAGPSVLALHDEVKKTYKRLCARGKNAARAFPADKTELLALMRHTHENRVRNQMVRMGRVSEYRGQQAGDLAQSHYWTSAGQTEIKESEIFVRLWVGAFALAGRSMKAWIDPMGKIVNTEKNDRDLTAAVNIRQVISNKEMVAEAMARRGIYFGETPELDRLGAEGNEGFVFALLRYLRGCRNQTFHLGARAGFLKEIRKELEKTRWGKAKEAEHVVLTDKTVAAIRAIIDNDAKALGARLLADLSGAFVAHYASKEHFSTLYSEIVKAVKDAPEVSSGLPRLKLLLKRADGVRGYVHGLRDTRKHAFATKLPPPPAPRELDDPATKARYIALLRLYDGPFRAYASGITGTALAGPAARAKEAATALAQSVNVTKAYSDVMEGRSSRLRPPNDGETLREYLSALTGETATEFRVQIGYESDSENARKQAEFIENYRRDMLAFMFEDYIRAKGFDWILKIEPGATAMTRAPVLPEPIDTRGQYEHWQAALYLVMHFVPASDVSNLLHQLRKWEALQGKYELVQDGDATDQADARREALDLVKRFRDVLVLFLKTGEARFEGRAAPFDLKPFRALFANPATFDRLFMATPTTARPAEDDPEGDGASEPELRVARTLRGLRQIARYNHMAVLSDLFAKHKVRDEEVARLAEIEDETQEKSQIVAAQELRTDLHDKVMKCHPKTISPEERQSYAAAIKTIEEHRFLVGRVYLGDHLRLHRLMMDVIGRLIDYAGAYERDTGTFLINASKQLGAGADWAVTIAGAANTDARTQTRKDLAHFNVLDRADGTPDLTALVNRAREMMAYDRKRKNAVPRSILDMLARLGLTLKWQMKDHLLQDATITQAAIKHLDKVRLTVGGPAAVTEARFSQDYLQMVAAVFNGSVQNPKPRRRDDGDAWHKPPKPATAQSQPDQKPPNKAPSAGSRLPPPQVGEVYEGVVVKVIDTGSLGFLAVEGVAGNIGLHISRLRRIREDAIIVGRRYRFRVEIYVPPKSNTSKL NAADLVRID c2c2-2 (SEQ IDMGNLFGHKRWYEVRDKKDFKIKRKVKVKRNYDGNKYILNI No. 
205)NENNNKEKIDNNKFIRKYINYKKNDNILKEFTRKFHAGNILFKLKGKEGIIRIENNDDFLETEEVVLYIEAYGKSEKLKALGITKKKIIDEAIRQGITKDDKKIEIKRQENEEEIEIDIRDEYTNKTLNDCSIILRIIENDELETKKSIYEIFKNINMSLYKIIEKIIENETEKVFENRYYEEHLREKLLKDDKIDVILTNFMEIREKIKSNLEILGFVKFYLNVGGDKKKSKNKKMLVEKILNINVDLTVEDIADFVIKELEFWNITKRIEKVKKVNNEFLEKRRNRTYIKSYVLLDKHEKFKIERENKKDKIVKFFVENIKNNSIKEKIEKILAEFKIDELIKKLEKELKKGNCDTEIFGIFKKHYKVNFDSKKFSKKSDEEKELYKIIYRYLKGRIEKILVNEQKVRLKKMEKIEIEKILNESILSEKILKRVKQYTLEHIMYLGKLRHNDIDMTTVNTDDFSRLHAKEELDLELITFFASTNMELNKIFSRENINNDENIDFFGGDREKNYVLDKKILNSKIKIIRDLDFIDNKNNITNNFIRKFTKIGTNERNRILHAISKERDLQGTQDDYNKVINIIQNLKISDEEVSKALNLDVVFKDKKNIITKINDIKISEENNNDIKYLPSFSKVLPEILNLYRNNPKNEPFDTIETEKIVLNALIYVNKELYKKLILEDDLEENESKNIFLQELKKTLGNIDEIDENIIENYYKNAQISASKGNNKAIKKYQKKVIECYIGYLRKNYEELFDFSDFKMNIQEIKKQIKDINDNKTYERITVKTSDKTIVINDDFEYIISIFALLNSNAVINKIRNRFFATSVWLNTSEYQNIIDILDEIMQLNTLRNECITENWNLNLEEFIQKMKEIEKDFDDFKIQTKKEIFNNYYEDIKNNILTEFKDDINGCDVLEKKLEKIVIFDDETKFEIDKKSNILQDEQRKLSNINKKDLKKKVDQYIKDKDQEIKSKILCRIIFNSDFLKKYKKEIDNLIEDMESENENKFQEIYYPKERKNELYIYKKNLFLNIGNPNFDKIYGLISNDIKMADAKFLFNIDGKNIRKNKISEIDAILKNLNDKLNGYSKEYKEKYIKKLKENDDFFAKNIQNKNYKSFEKDYNRVSEYKKIRDLVEFNYLNKIESYLIDINWKLAIQMARFERDMHYIVNGLRELGIIKLSGYNTGISRAYPKRNGSDGFYTTTAYYKFFDEESYKKFEKICYGFGIDLSENSEINKPENESIRNYISHFYIVRNPFADYSIAEQIDRVSNLLSYSTRYNNSTYASVFEVFKKDVNLDYDELKKKFKLIGNNDILERLMKPKKVSVLELESYNSDYIKNLIIELLTKI ENTNDTL c2c2-3 L wadeiMKVTKVDGISHKKYIEEGKLVKSTSEENRTSERLSELLSIRLD (Lw2)IYIKNPDNASEEENRIRRENLKKFFSNKVLHLKDSVLYLKNR (SEQ IDKEKNAVQDKNYSEEDISEYDLKNKNSFSVLKKILLNEDVNSE No. 
206)ELEIFRKDVEAKLNKINSLKYSFEENKANYQKINENNVEKVGGKSKRNIIYDYYRESAKRNDYINNVQEAFDKLYKKEDIEKLFFLIENSKKHEKYKIREYYHKIIGRKNDKENFAKIIYEEIQNVNNIKELIEKIPDMSELKKSQVFYKYYLDKEELNDKNIKYAFCHFVEIEMSQLLKNYVYKRLSNISNDKIKRIFEYQNLKKLIENKLLNKLDTYVRNCGKYNYYLQVGEIATSDFIARNRQNEAFLRNIIGVSSVAYFSLRNILETENENDITGRMRGKTVKNNKGEEKYVSGEVDKIYNENKQNEVKENLKMFYSYDFNMDNKNEIEDFFANIDEAISSIRHGIVHFNLELEGKDIFAFKNIAPSEISKKMFQNEINEKKLKLKIFKQLNSANVFNYYEKDVIIKYLKNTKFNFVNKNIPFVPSFTKLYNKIEDLRNTLKFFWSVPKDKEEKDAQIYLLKNIYYGEFLNKFVKNSKVFFKITNEVIKINKQRNQKTGHYKYQKFENIEKTVPVEYLAIIQSREMINNQDKEEKNTYIDFIQQIFLKGFIDYLNKNNLKYIESNNNNDNNDIFSKIKIKKDNKEKYDKILKNYEKHNRNKEIPHEINEFVREIKLGKILKYTENLNMFYLILKLLNHKELTNLKGSLEKYQSANKEETFSDELELINLLNLDNNRVTEDFELEANEIGKFLDFNENKIKDRKELKKFDTNKIYFDGENIIKHRAFYNIKKYGMLNLLEKIADKAKYKISLKELKEYSNKKNEIEKNYTMQQNLHRKYARPKKDEKFNDEDYKEYEKAIGNIQKYTHLKNKVEFNELNLLQGLLLKILHRLVGYTSIWERDLRFRLKGEFPENHYIEEIFNFDNSKNVKYKSGQIVEKYINFYKELYKDNVEKRSIYSDKKVKKLKQEKKDLYIRNYIAHFNYIPHAEISLLEVLENLRKLLSYDRKLKNAIMKSIVDILKEYGFVATFKIGADKKIEIQTLESEKIVHLKNLKKKKLMTDRNSEELCELVKVMFEYKALEKRPAATKKAGQAKKKKGSYPYDVPDYAYPY DVPDYAYPYDVPDYA* c2c2-4 ListeriaMWISIKTLIHHLGVLFFCDYMYNRREKKIIEVKTMRITKVEV seeligeriDRKKVLISRDKNGGKLVYENEMQDNTEQIMHHKKSSFYKS (SEQ IDVVNKTICRPEQKQMKKLVHGLLQENSQEKIKVSDVTKLNIS No. 
207)NFLNHRFKKSLYYFPENSPDKSEEYRIEINLSQLLEDSLKKQQGTFICWESFSKDMELYINWAENYISSKTKLIKKSIRNNRIQSTESRSGQLMDRYMKDILNKNKPFDIQSVSEKYQLEKLTSALKATFKEAKKNDKEINYKLKSTLQNHERQIIEELKENSELNQFNIEIRKHLETYFPIKKTNRKVGDIRNLEIGEIQKIVNHRLKNKIVQRILQEGKLASYEIESTVNSNSLQKIKIEEAFALKFINACLFASNNLRNMVYPVCKKDILMIGEFKNSFKEIKHKKFIRQWSQFFSQEITVDDIELASWGLRGAIAPIRNEIIHLKKHSWKKFFNNPTFKVKKSKIINGKTKDVTSEFLYKETLFKDYFYSELDSVPELIINKMESSKILDYYSSDQLNQVFTIPNFELSLLTSAVPFAPSFKRVYLKGFDYQNQDEAQPDYNLKLNIYNEKAFNSEAFQAQYSLFKMVYYQVFLPQFTTNNDLFKSSVDFILTLNKERKGYAKAFQDIRKMNKDEKPSEYMSYIQSQLMLYQKKQEEKEKINHFEKFINQVFIKGFNSFIEKNRLTYICHPTKNTVPENDNIEIPFHTDMDDSNIAFWLMCKLLDAKQLSELRNEMIKFSCSLQSTEEISTFTKAREVIGLALLNGEKGCNDWKELFDDKEAWKKNMSLYVSEELLQSLPYTQEDGQTPVINRSIDLVKKYGTETILEKLFSSSDDYKVSAKDIAKLHEYDVTEKIAQQESLHKQWIEKPGLARDSAWTKKYQNVINDISNYQWAKTKVELTQVRHLHQLTIDLLSRLAGYMSIADRDFQFSSNYILERENSEYRVTSWILLSENKNKNKYNDYELYNLKNASIKVSSKNDPQLKVDLKQLRLTLEYLELFDNRLKEKRNNISHFNYLNGQLGNSILELFDDARDVLSYDRKLKNAVSKSLKEILSSHGMEVTFKPLYQTNHHLKIDKLQPKKIHHLG EKSTVSSNQVSNEYCQLVRTLLTMKC2-17 Leptotrichia MKVTKVGGISHKKYTSEGRLVKSESEENRTDERLSALLNMR buccalisLDMYIKNPSSTETKENQKRIGKLKKFFSNKMVYLKDNTLSL C-1013-bKNGKKENIDREYSETDILESDVRDKKNFAVLKKIYLNENVNS (SEQ IDEELEVFRNDIKKKLNKINSLKYSFEKNKANYQKINENNIEKV No. 
208)EGKSKRNIIYDYYRESAKRDAYVSNVKEAFDKLYKEEDIAKLVLEIENLTKLEKYKIREFYHEIIGRKNDKENFAKIIYEEIQNVNNMKELIEKVPDMSELKKSQVFYKYYLDKEELNDKNIKYAFCHFVEIEMSQLLKNYVYKRLSNISNDKIKRIFEYQNLKKLIENKLLNKLDTYVRNCGKYNYYLQDGEIATSDFIARNRQNEAFLRNIIGVSSVAYFSLRNILETENENDITGRMRGKTVKNNKGEEKYVSGEVDKIYNENKKNEVKENLKMFYSYDFNMDNKNEIEDFFANIDEAISSIRHGIVHFNLELEGKDIFAFKNIAPSEISKKMFQNEINEKKLKLKIFRQLNSANVFRYLEKYKILNYLKRTRFEFVNKNIPFVPSFTKLYSRIDDLKNSLGIYWKTPKTNDDNKTKEIIDAQIYLLKNIYYGEFLNYFMSNNGNFFEISKEIIELNKNDKRNLKTGFYKLQKFEDIQEKIPKEYLANIQSLYMINAGNQDEEEKDTYIDFIQKIFLKGFMTYLANNGRLSLIYIGSDEETNTSLAEKKQEFDKFLKKYEQNNNIKIPYEINEFLREIKLGNILKYTERLNMFYLILKLLNHKELTNLKGSLEKYQSANKEEAFSDQLELINLLNLDNNRVTEDFELEADEIGKFLDFNGNKVKDNKELKKFDTNKIYFDGENIIKHRAFYNIKKYGMLNLLEKIADKAGYKISIEELKKYSNKKNEIEKNHKMQENLHRKYARPRKDEKFTDEDYESYKQAIENIEEYTHLKNKVEFNELNLLQGLLLRILHRLVGYTSIWERDLRFRLKGEFPENQYIEEIFNFENKKNVKYKGGQIVEKYIKFYKELHQNDEVKINKYSSANIKVLKQEKKDLYIRNYIAHFNYIPHAEISLLEVLENLRKLLSYDRKLKNAVMKSVVDILKEYGFVATFKIGADKKIGIQTLESEKIVHLKNLKKKKLMTDRN SEELCKLVKIMFEYKMEEKKSEN C2-18Herbinix MKLTRRRISGNSVDQKITAAFYRDMSQGLLYYDSEDNDCTD hemicellulosilyticaKVIESMDFERSWRGRILKNGEDDKNPFYMFVKGLVGSNDKI (SEQ IDVCEPIDVDSDPDNLDILINKNLTGFGRNLKAPDSNDTLENLIR No. 
209)KIQAGIPEEEVLPELKKIKEMIQKDIVNRKEQLLKSIKNNRIPFSLEGSKLVPSTKKMKWLFKLIDVPNKTFNEKMLEKYWEIYDYDKLKANITNRLDKTDKKARSISRAVSEELREYHKNLRTNYNRFVSGDRPAAGLDNGGSAKYNPDKEEFLLFLKEVEQYFKKYFPVKSKHSNKSKDKSLVDKYKNYCSYKVVKKEVNRSIINQLVAGLIQQGKLLYYFYYNDTWQEDFLNSYGLSYIQVEEAFKKSVMTSLSWGINRLTSFFIDDSNTVKFDDITTKKAKEAIESNYFNKLRTCSRMQDHFKEKLAFFYPVYVKDKKDRPDDDIENLIVLVKNAIESVSYLRNRTFHFKESSLLELLKELDDKNSGQNKIDYSVAAEFIKRDIENLYDVFREQIRSLGIAEYYKADMISDCFKTCGLEFALYSPKNSLMPAFKNVYKRGANLNKAYIRDKGPKETGDQGQNSYKALEEYRELTWYIEVKNNDQSYNAYKNLLQLIYYHAFLPEVRENEALITDFINRTKEWNRKETEERLNTKNNKKHKNFDENDDITVNTYRYESIPDYQGESLDDYLKVLQRKQMARAKEVNEKEEGNNNYIQFIRDVVVWAFGAYLENKLKNYKNELQPPLSKENIGLNDTLKELFPEEKVKSPFNIKCRFSISTFIDNKGKSTDNTSAEAVKTDGKEDEKDKKNIKRKDLLCFYLFLRLLDENEICKLQHQFIKYRCSLKERRFPGNRTKLEKETELLAELEELMELVRFTMPSIPEISAKAESGYDTMIKKYFKDFIEKKVFKNPKTSNLYYHSDSKTPVTRKYMALLMRSAPLHLYKDIFKGYYLITKKECLEYIKLSNIIKDYQNSLNELHEQLERIKLKSEKQNGKDSLYLDKKDFYKVKEYVENLEQVARYKHLQHKINFESLYRIFRIHVDIAARMVGYTQDWERDMHFLFKALVYNGVLEERRFEAIFNNNDDNNDGRIVKKIQNNLNNKNRELVSMLCWNKKLNKNEFGAIIWKRNPIAHLNHFTQTEQNSKSSLESLINSLRILLAYDRKRQNAVTKTINDLLLNDYHIRIKWEGRVDEGQIYFNIKEKEDIENEPIIHLKHLHKKDCYIYKNSYMFDKQKEWICNGIKEEVYDKSILKCIGNLFKFDYEDKNKSSANPKHT C2-19 [Eubacterium]MLRRDKEVKKLYNVFNQIQVGTKPKKWNNDEKLSPEENER rectaleRAQQKNIKMKNYKWREACSKYVESSQRIINDVIFYSYRKAK (SEQ IDNKLRYMRKNEDILKKMQEAEKLSKFSGGKLEDFVAYTLRKS No. 
210)LVVSKYDTQEFDSLAAMVVFLECIGKNNISDHEREIVCKLLELIRKDFSKLDPNVKGSQGANIVRSVRNQNMIVQPQGDRFLFPQVYAKENETVTNKNVEKEGLNEFLLNYANLDDEKRAESLRKLRRILDVYFSAPNHYEKDMDITLSDNIEKEKFNVWEKHECGKKETGLFVDIPDVLMEAEAENIKLDAVVEKRERKVLNDRVRKQNIICYRYTRAVVEKYNSNEPLFFENNAINQYWIHHIENAVERILKNCKAGKLFKLRKGYLAEKVWKDAINLISIKYIALGKAVYNFALDDIWKDKKNKELGIVDERIRNGITSFDYEMIKAHENLQRELAVDIAFSVNNLARAVCDMSNLGNKESDFLLWKRNDIADKLKNKDDMASVSAVLQFFGGKSSWDINIFKDAYKGKKKYNYEVRFIDDLRKAIYCARNENFHFKTALVNDEKWNTELFGKIFERETEFCLNVEKDRFYSNNLYMFYQVSELRNMLDHLYSRSVSRAAQVPSYNSVIVRTAFPEYITNVLGYQKPSYDADTLGKWYSACYYLLKEIYYNSFLQSDRALQLFEKSVKTLSWDDKKQQRAVDNFKDHFSDIKSACTSLAQVCQIYMTEYNQQNNQIKKVRSSNDSIFDQPVYQHYKVLLKKAIANAFADYLKNNKDLFGFIGKPFKANEIREIDKEQFLPDWTSRKYEALCIEVSGSQELQKWYIVGKFLNARSLNLMVGSMRSYIQYVTDIKRRAASIGNELHVSVHDVEKVEKWVQVIEVCSLLASRTSNQFEDYFNDKDDYARYLKSYVDFSNVDMPSEYSALVDFSNEEQSDLYVDPKNPKVNRNIVHSKLFAADHILRDIVEPVSKDNIEEFYSQKAEIAYCKIKGKEITAEEQKAVLKYQKLKNRVELRDIVEYGEIINELLGQLINWSFMRERDLLYFQLGFHYDCLRNDSKKPEGYKNIKVDENSIKDAILYQIIGMYVNGVTVYAPEKDGDKLKEQCVKGGVGVKVSAFHRYSKYLGLNEKTLYNAGLEIFEVVAEHEDIINLRNGIDHFKYYLGDYRSMLSIYSEVFDRFFTYDIKYQKNVLNLLQNILLRHNVIVEPILESGFKTIGEQTKPGAKLSIRSIKSDTFQYKVKGGTLITDAKDERYLETIRKILYYAENEEDNLKKSVVVTNADKYEKNKESDDQNKQKEKKNKDNKGKKNEETKSDAEKN NNERLSYNPFANLNFKLSN C2-20Eubacteriaceae MKISKESHKRTAVAVMEDRVGGVVYVPGGSGIDLSNNLKK bacteriumRSMDTKSLYNVFNQIQAGTAPSEYEWKDYLSEAENKKREAQ CHKCI004KMIQKANYELRRECEDYAKKANLAVSRIIFSKKPKKIFSDDDI (SEQ IDISHMKKQRLSKFKGRMEDFVLIALRKSLVVSTYNQEVFDSR No. 
211)KAATVFLKNIGKKNISADDERQIKQLMALIREDYDKWNPDKDSSDKKESSGTKVIRSIEHQNMVIQPEKNKLSLSKISNVGKKTKTKQKEKAGLDAFLKEYAQIDENSRMEYLKKLRRLLDTYFAAPSSYIKGAAVSLPENINFSSELNVWERHEAAKKVNINFVEIPESLLNAEQNNNKINKVEQEHSLEQLRTDIRRRNITCYHFANALAADERYHTLFFENMAMNQFWIHHMENAVERILKKCNVGTLFKLRIGYLSEKVWKDMLNLLSIKYIALGKAVYHFALDDIWKADIWKDASDKNSGKINDLTLKGISSFDYEMVKAQEDLQREMAVGVAFSTNNLARVTCKMDDLSDAESDFLLWNKEAIRRHVKYTEKGEILSAILQFFGGRSLWDESLFEKAYSDSNYELKFLDDLKRAIYAARNETFHFKTAAIDGGSWNTRLFGSLFEKEAGLCLNVEKNKFYSNNLVLFYKQEDLRVFLDKLYGKECSRAAQIPSYNTILPRKSFSDFMKQLLGLKEPVYGSAILDQWYSACYYLFKEVYYNLFLQDSSAKALFEKAVKALKGADKKQEKAVESFRKRYWEISKNASLAEICQSYITEYNQQNNKERKVRSANDGMFNEPIYQHYKMLLKEALKMAFASYIKNDKELKFVYKPTEKLFEVSQDNFLPNWNSEKYNTLISEVKNSPDLQKWYIVGKFMNARMLNLLLGSMRSYLQYVSDIQKRAAGLGENQLHLSAENVGQVKKWIQVLEVCLLLSVRISDKFTDYFKDEEEYASYLKEYVDFEDSAMPSDYSALLAFSNEGKIDLYVDASNPKVNRNIIQAKLYAPDMVLKKVVKKISQDECKEFNEKKEQIIVIQFKNKGDEVSWEEQQKILEYQKLKNRVELRDLSEYGELINELLGQLINWSYLRERDLLYFQLGFHYSCLMNESKKPDAYKTIRRGTVSIENAVLYQIIAMYINGFPVYAPEKGELKPQCKTGSAGQKIRAFCQWASMVEKKKYELYNAGLELFEVVKEHDNIIDLRNKIDHFKYYQGNDSILALYGEIFDRFFTYDMKYRNNVLNHLQNILLRHNVIIKPIISKDKKEVGRGKMKDRAAFLLEEVSSDRFTYKVKEGERKIDAKNRLYLETVRDILYFPNRAVNDKGEDVIICSKKAQDLNEKKADRDKNHDKSKDTNQKKEGKNQEEKSENKEPYSDRMTW KPFAGIKLE C2-21 Blautia sp.MKISKVDHVKSGIDQKLSSQRGMLYKQPQKKYEGKQLEEH Marseille-VRNLSRKAKALYQVFPVSGNSKMEKELQIINSFIKNILLRLDS P2398GKTSEEIVGYINTYSVASQISGDHIQELVDQHLKESLRKYTCV (SEQ IDGDKRIYVPDIIVALLKSKFNSETLQYDNSELKILIDFIREDYLK No. 
212)EKQIKQIVHSIENNSTPLRIAEINGQKRLIPANVDNPKKSYIFEFLKEYAQSDPKGQESLLQHMRYLILLYLYGPDKITDDYCEEIEAWNFGSIVMDNEQLFSEEASMLIQDRIYVNQQIEEGRQSKDTAKVKKNKSKYRMLGDKIEHSINESVVKHYQEACKAVEEKDIPWIKYISDHVMSVYSSKNRVDLDKLSLPYLAKNTWNTWISFIAMKYVDMGKGVYHFAMSDVDKVGKQDNLIIGQIDPKFSDGISSFDYERIKAEDDLHRSMSGYIAFAVNNFARAICSDEFRKKNRKEDVLTVGLDEIPLYDNVKRKLLQYFGGASNWDDSIIDIIDDKDLVACIKENLYVARNVNFHFAGSEKVQKKQDDILEEIVRKETRDIGKHYRKVFYSNNVAVFYCDEDIIKLMNHLYQREKPYQAQIPSYNKVISKTYLPDLIFMLLKGKNRTKISDPSIMNMFRGTFYFLLKEIYYNDFLQASNLKEMFCEGLKNNVKNKKSEKPYQNFMRRFEELENMGMDFGEICQQIMTDYEQQNKQKKKTATAVMSEKDKKIRTLDNDTQKYKHFRTLLYIGLREAFIIYLKDEKNKEWYEFLREPVKREQPEEKEFVNKWKLNQYSDCSELILKDSLAAAWYVVAHFINQAQLNHLIGDIKNYIQFISDIDRRAKSTGNPVSESTEIQIERYRKILRVLEFAKFFCGQITNVLTDYYQDENDFSTHVGHYVKFEKKNMEPAHALQAFSNSLYACGKEKKKAGFYYDGMNPIVNRNITLASMYGNKKLLENAMNPVTEQDIRKYYSLMAELDSVLKNGAVCKSEDEQKNLRHFQNLKNRIELVDVLTLSELVNDLVAQLIGWVYIRERDMMYLQLGLHYIKLYFTDSVAEDSYLRTLDLEEGSIADGAVLYQIASLYSFNLPMYVKPNKSSVYCKKHVNSVATKFDIFEKEYCNGDETVIENGLRLFENINLHKDMVKFRDYLAHFKYFAKLDESILELYSKAYDFFFSYNIKLKKSVSYVLTNVLLSYFINAKLSFSTYKSSGNKTVQHRTTKISVVAQTDYFTYKLRSIVKNKNGVESIENDDRRCEV VNIAARDKEFVDEVCNVINYNSDKC2-22 Leptotrichia MGNLFGHKRWYEVRDKKDFKIKRKVKVKRNYDGNKYILNIsp. oral taxon NENNNKEKIDNNKFIGEFVNYKKNNNVLKEFKRKFHAGNIL 879 str. F0557FKLKGKEEIIRIENNDDFLETEEVVLYIEVYGKSEKLKALEITK (SEQ IDKKIIDEAIRQGITKDDKKIEIKRQENEEEIEIDIRDEYTNKTLND No. 
213)CSIILRIIENDELETKKSIYEIFKNINMSLYKIIEKIIENETEKVFENRYYEEHLREKLLKDNKIDVILTNFMEIREKIKSNLEIMGFVKFYLNVSGDKKKSENKKMFVEKILNTNVDLTVEDIVDFIVKELKFWNITKRIEKVKKFNNEFLENRRNRTYIKSYVLLDKHEKFKIERENKKDKIVKFFVENIKNNSIKEKIEKILAEFKINELIKKLEKELKKGNCDTEIFGIFKKHYKVNFDSKKFSNKSDEEKELYKIIYRYLKGRIEKILVNEQKVRLKKMEKIEIEKILNESILSEKILKRVKQYTLEHIMYLGKLRHNDIVKMTVNTDDFSRLHAKEELDLELITFFASTNMELNKIFNGKEKVTDFFGFNLNGQKITLKEKVPSFKLNILKKLNFINNENNIDEKLSHFYSFQKEGYLLRNKILHNSYGNIQETKNLKGEYENVEKLIKELKVSDEEISKSLSLDVIFEGKVDIINKINSLKIGEYKDKKYLPSFSKIVLEITRKFREINKDKLFDIESEKIILNAVKYVNKILYEKITSNEENEFLKTLPDKLVKKSNNKKENKNLLSIEEYYKNAQVSSSKGDKKAIKKYQNKVTNAYLEYLENTFTEIIDFSKFNLNYDEIKTKIEERKDNKSKIIIDSISTNINITNDIEYIISIFALLNSNTYINKIRNRFFATSVWLEKQNGTKEYDYENIISILDEVLLINLLRENNITDILDLKNAIIDAKIVENDETYIKNYIFESNEEKLKKRLFCEELVDKEDIRKIFEDENFKFKSFIKKNEIGNFKINFGILSNLECNSEVEAKKIIGKNSKKLESFIQNIIDEYKSNIRTLFSSEFLEKYKEEIDNLVEDTESENKNKFEKIYYPKEHKNELYIYKKNLFLNIGNPNFDKIYGLISKDIKNVDTKILFDDDIKKNKISEIDAILKNLNDKLNGYSNDYKAKYVNKLKENDDFFAKNIQNENYSSFGEFEKDYNKVSEYKKIRDLVEFNYLNKIESYLIDINWKLAIQMARFERDMHYIVNGLRELGIIKLSGYNTGISRAYPKRNGSDGFYTTTAYYKFFDEESYKKFEKICYGFGIDLSENSEINKPENESIRNYISHFYIVRNPFADYSIAEQIDRVSNLLSYSTRYNNSTYASVFEVFKKDVNLDYDELKKKFRLIGNNDILERLMKPKKVSVLELESYNSDYIKNLIIELLTKIENTNDTL C2-23 LachnospiraceaeMKISKVDHTRMAVAKGNQHRRDEISGILYKDPTKTGSIDFDE bacteriumRFKKLNCSAKILYHVFNGIAEGSNKYKNIVDKVNNNLDRVL NK4A144FTGKSYDRKSIIDIDTVLRNVEKINAFDRISTEEREQIIDDLLEI (SEQ IDQLRKGLRKGKAGLREVLLIGAGVIVRTDKKQEIADFLEILDE No. 
214)DFNKTNQAKNIKLSIENQGLVVSPVSRGEERIFDVSGAQKGKSSKKAQEKEALSAFLLDYADLDKNVRFEYLRKIRRLINLYFYVKNDDVMSLTEIPAEVNLEKDFDIWRDHEQRKEENGDFVGCPDILLADRDVKKSNSKQVKIAERQLRESIREKNIKRYRFSIKTIEKDDGTYFFANKQISVFWIHRIENAVERILGSINDKKLYRLRLGYLGEKVWKDILNFLSIKYIAVGKAVFNFAMDDLQEKDRDIEPGKISENAVNGLTSFDYEQIKADEMLQREVAVNVAFAANNLARVTVDIPQNGEKEDILLWNKSDIKKYKKNSKKGILKSILQFFGGASTWNMKMFEIAYHDQPGDYEENYLYDIIQIIYSLRNKSFHFKTYDHGDKNWNRELIGKMIEHDAERVISVEREKFHSNNLPMFYKDADLKKILDLLYSDYAGRASQVPAFNTVLVRKNFPEFLRKDMGYKVHFNNPEVENQWHSAVYYLYKEIYYNLFLRDKEVKNLFYTSLKNIRSEVSDKKQKLASDDFASRCEEIEDRSLPEICQIIMTEYNAQNFGNRKVKSQRVIEKNKDIFRHYKMLLIKTLAGAFSLYLKQERFAFIGKATPIPYETTDVKNFLPEWKSGMYASFVEEIKNNLDLQEWYIVGRFLNGRMLNQLAGSLRSYIQYAEDIERRAAENRNKLFSKPDEKIEACKKAVRVLDLCIKISTRISAEFTDYFDSEDDYADYLEKYLKYQDDAIKELSGSSYAALDHFCNKDDLKFDIYVNAGQKPILQRNIVMAKLFGPDNILSEVMEKVTESAIREYYDYLKKVSGYRVRGKCSTEKEQEDLLKFQRLKNAVEFRDVTEYAEVINELLGQLISWSYLRERDLLYFQLGFHYMCLKNKSFKPAEYVDIRRNNGTIIHNAILYQIVSMYINGLDFYSCDKEGKTLKPIETGKGVGSKIGQFIKYSQYLYNDPSYKLEIYNAGLEVFENIDEHDNITDLRKYVDHFKYYAYGNKMSLLDLYSEFFDRFFTYDMKYQKNVVNVLENILLRHFVIFYPKFGSGKKDVGIRDCKKERAQIEISEQSLTSEDFMFKLDDKAGEEAKKFPARDERYLQTIAKLLYYPNEIEDMNRFMKKGETINKKVQFNRKKKITRKQKNNSSNEVLSSTMGYLFKNIKL C2-24 ChloroflexusMTDQVRREEVAAGELADTPLAAAQTPAADAAVAATPAPAE aggregansAVAPTPEQAVDQPATTGESEAPVTTAQAAAHEAEPAEATGA (SEQ IDSFTPVSEQQPQKPRRLKDLQPGMELEGKVTSIALYGIFVDVG No. 215)VGRDGLVHISEMSDRRIDTPSELVQIGDTVKVWVKSVDLDARRISLTMLNPSRGEKPRRSRQSQPAQPQPRRQEVDREKLASLKVGEIVEGVITGFAPFGAFADIGVGKDGLIHISELSEGRVEKPEDAVKVGERYQFKVLEIDGEGTRISLSLRRAQRTQRMQQLEPGQIIEGTVSGIATFGAFVDIGVGRDGLVHISALAPHRVAKVEDVVKVGDKVKVKVLGVDPQSKRISLTMRLEEEQPATTAGDEAAEPAEEVTPTRRGNLERFAAAAQTARERSERGERSERGERRERRERRPAQSSPDTYIVGEDDDESFEGNATIEDLLTKFGGSSSRRDRDRRRRHEDDDDEEMERPSNRRQREAIRRTLQQIGYDE C2-25 DemequinaMDLTWHALLILFIVALLAGFLDTLAGGGGLLTVPALLLTGIP aurantiacaPLQALGTNKLQSSFGTGMATYQVIRKKRVHWRDVRWPMV (SEQ IDWAFLGSAAGAVAVQFIDTDALLIIIPVVLALVAAYFLFVPKS No. 
216)HLPPPEPRMSDPAYEATLVPIIGAYDGAFGPGTGSLYALSGVALRAKTLVQSTAIAKTLNFATNFAALLVFAFAGHMLWTVGAVMIAGQLIGAYAGSHMLFRVNPLVLRVLIVVMSLGMLIRVL LD C2-26 ThalassospiraMRIIKPYGRSHVEGVATQEPRRKLRLNSSPDISRDIPGFAQSH sp.DALIIAQWISAIDKIATKPKPDKKPTQAQINLRTTLGDAAWQ TSL5-1HVMAENLLPAATDPAIREKLHLIWQSKIAPWGTARPQAEKD (SEQ IDGKPTPKGGWYERFCGVLSPEAITQNVARQIAKDIYDHLHVA No. 217)AKRKGREPAKQGESSNKPGKFKPDRKRGLIEERAESIAKNALRPGSHAPCPWGPDDQATYEQAGDVAGQIYAAARDCLEEKKRRSGNRNTSSVQYLPRDLAAKILYAQYGRVFGPDTTIKAALDEQPSLFALHKAIKDCYHRLINDARKRDILRILPRNMAALFRLVRAQYDNRDINALIRLGKVIHYHASEQGKSEHHGIRDYWPSQQDIQNSRFWGSDGQADIKRHEAFSRIWRHIIALASRTLHDWADPHSQKFSGENDDILLLAKDAIEDDVFKAGHYERKCDVLFGAQASLFCGAEDFEKAILKQAITGTGNLRNATFHFKGKVRFEKELQELTKDVPVEVQSAIAALWQKDAEGRTRQIAETLQAVLAGHFLTEEQNRHIFAALTAAMAQPGDVPLPRLRRVLARHDSICQRGRILPLSPCPDRAKLEESPALTCQYTVLKMLYDGPFRAWLAQQNSTILNHYIDSTIARTDKAARDMNGRKLAQAEKDLITSRAADLPRLSVDEKMGDFLARLTAATATEMRVQRGYQSDGENAQKQAAFIGQFECDVIGRAFADFLNQSGFDFVLKLKADTPQPDAAQCDVTALIAPDDISVSPPQAWQQVLYFILHLVPVDDASHLLHQIRKWQVLEGKEKPAQIAHDVQSVLMLYLDMHDAKFTGGAALHGIEKFAEFFAHAADFRAVFPPQSLQDQDRSIPRRGLREIVRFGHLPLLQHMSGTVQITHDNVVAWQAARTAGATGMSPIARRQKQREELHALAVERTARFRNADLQNYMHALVDVIKHRQLSAQVTLSDQVRLHRLMMGVLGRLVDYAGLWERDLYFVVLALLYHHGATPDDVFKGQGKKNLADGQVVAALKPKNRKAAAPVGVFDDLDHYGIYQDDRQSIRNGLSHFNMLRGGKAPDLSHWVNQTRSLVAHDRKLKNAVAKSVIEMLAREGFDLDWGIQTDRGQHILSHGKIRTRQAQHFQKSRLHIVKKSAKPDKNDTVKIRENLHGDAMVERVVQLFAAQVQKRYDITVEKRLDHLFLKPQDQKGKNGIHTHNGWSKTEKKRRPSRENRKGNH EN C2-27 SAMN04487830_13920MKFSKESHRKTAVGVTESNGIIGLLYKDPLNEKEKIEDVVNQ [PseudobutyrivibrioRANSTKRLFNLFGTEATSKDISRASKDLAKVVNKAIGNLKGN sp. OR37]KKFNKKEQITKGLNTKIIVEELKNVLKDEKKLIVNKDIIDEAC (SEQ IDSRLLKTSFRTAKTKQAVKMILTAVLIENTNLSKEDEAFVHEY No. 
218)FVKKLVNEYNKTSVKKQIPVALSNQNMVIQPNSVNGTLEISETKKSKETKTTEKDAFRAFLRDYATLDENRRHKMRLCLRNLVNLYFYGETSVSKDDFDEWRDHEDKKQNDELFVKKIVSIKTDRKGNVKEVLDVDATIDAIRTNNIACYRRALAYANENPDVFFSDTMLNKFWIHHVENEVERIYGHINNNTGDYKYQLGYLSEKVWKGIINYLSIKYIAEGKAVYNYAMNALAKDNNSNAFGKLDEKFVNGITSFEYERIKAEETLQRECAVNIAFAANHLANATVDLNEKDSDFLLLKHEDNKDTLGAVARPNILRNILQFFGGKSRWNDFDFSGIDEIQLLDDLRKMIYSLRNSSFHFKTENIDNDSWNTKLIGDMFAYDFNMAGNVQKDKMYSNNVPMFYSTSDIEKMLDRLYAEVHERASQVPSFNSVFVRKNFPDYLKNDLKITSAFGVDDALKWQSAVYYVCKEIYYNDFLQNPETFTMLKDYVQCLPIDIDKSMDQKLKSERNAHKNFKEAFATYCKECDSLSAICQMIMTEYNNQNKGNRKVISARTKDGDKLIYKHYKMILFEALKNVFTIYLEKNINTYGFLKKPKLINNVPAIEEFLPNYNGRQYETLVNRITEETELQKWYIVGRLLNPKQVNQLIGNFRSYVQYVNDVARRAKQTGNNLSNDNIAWDVKNIIQIFDVCTKLNGVTSNILEDYFDDGDDYARYLKNFVDYTNKNNDHSATLLGDFCAKEIDGIKIGIYHDGTNPIVNRNIIQCKLYGATGIISDLTKDGSILSVDYEIIKKYMQMQKEIKVYQQKGICKTKEEQQNLKKYQELKNIVELRNIIDYSEILDELQGQLINWGYLRERDLMYFQLGFHYLCLHNESKKPVGYNNAGDISGAVLYQIVAMYTNGLSLIDANGKSKKNAKASAGAKVGSFCSYSKEIRGVDKDTKEDDDPIYLAGVELFENINEHQQCINLRNYIEHFHYYAKHDRSMLDLYSEVFDRFFTYDMKYTKNVPNMMYNILLQHLVVPAFEFGSSEKRLDDNDEQTKPRAMFTLREKNGLSSEQFTYRLGDGNSTVKLSARGDDYLRAVASLLYYPDRAPEGLIRDAEAEDKFAKINHSNPKSD NRNNRGNFKNPKVQWYNNKTKRK C2-28SAMN02910398_00008 MKISKVDHRKTAVKITDNKGAEGFIYQDPTRDSSTMEQIISN[Butyrivibrio sp. RARSSKVLFNIFGDTKKSKDLNKYTESLIIYVNKAIKSLKGDK YAB3001]RNNKYEEITESLKTERVLNALIQAGNEFTCSENNIEDALNKY (SEQ IDLKKSFRVGNTKSALKKLLMAAYCGYKLSIEEKEEIQNYFVD No. 
219)KLVKEYNKDTVLKYTAKSLKHQNMVVQPDTDNHVFLPSRIAGATQNKMSEKEALTEFLKAYAVLDEEKRHNLRIILRKLVNLYFYESPDFIYPENNEWKEHDDRKNKTETFVSPVKVNEEKNGKTFVKIDVPATKDLIRLKNIECYRRSVAETAGNPITYFTDHNISKFWIHHIENEVEKIFALLKSNWKDYQFSVGYISEKVWKEIINYLSIKYIAIGKAVYNYALEDIKKNDGTLNFGVIDPSFYDGINSFEYEKIKAEETFQREVAVYVSFAVNHLSSATVKLSEAQSDMLVLNKNDIEKIAYGNTKRNILQFFGGQSKWKEFDFDRYINPVNYTDIDFLFDIKKMVYSLRNESFHFTTTDTESDWNKNLISAMFEYECRRISTVQKNKFFSNNLPLFYGENSLERVLHKLYDDYVDRMSQVPSFGNVFVRKKFPDYMKEIGIKHNLSSEDNLKLQGALYFLYKEIYYNAFISSEKAMKIFVDLVNKLDTNARDDKGRITHEAMAHKNFKDAISHYMTHDCSLADICQKIMTEYNQQNTGHRKKQTTYSSEKNPEIFRHYKMILFMLLQKAMTEYISSEEIFDFIMKPNSPKTDIKEEEFLPQYKSCAYDNLIKLIADNVELQKWYITARLLSPREVNQLIGSFRSYKQFVSDIERRAKETNNSLSKSGMTVDVENITKVLDLCTKLNGRFSNELTDYFDSKDDYAVYVSKFLDFGFKIDEKFPAALLGEFCNKEENGKKIGIYHNGTEPILNSNIIKSKLYGITDVVSRAVKPVSEKLIREYLQQEVKIKPYLENGVCKNKEEQAALRKYQELKNRIEFRDIVEYSEIINELMGQLINFSYLRERDLMYFQLGFHYLCLNNYGAKPEGYYSIVNDKRTIKGAILYQIVAMYTYGLPIYHYVDGTISDRRKNKKTVLDTLNSSETVGAKIKYFIYYSDELFNDSLILYNAGLELFENINEHENIVNLRKYIDHFKYYVSQDRSLLDIYSEVFDRYFTYDRKYKKNVMNLFSNIMLKHFIITDFEFSTGEKTIGEKNTAKKECAKVRIKRGGLSSDKFTYKFKDAKPIELSAKNTEFLDGVARILYYPENVVLTDLVRNSEVEDEKRIEKYDRNHNSSPTRKDKTYKQDVKKNYNKKTSKAFDSSKLDTKSVGNNLSDNPVLKQFLSESKKK R C2-29 Blautia sp.MKISKVDHVKSGIDQKLSSQRGMLYKQPQKKYEGKQLEEH Marseille-VRNLSRKAKALYQVFPVSGNSKMEKELQIINSFIKNILLRLDS P2398GKTSEEIVGYINTYSVASQISGDHIQELVDQHLKESLRKYTCV (SEQ IDGDKRIYVPDIIVALLKSKFNSETLQYDNSELKILIDFIREDYLK No. 
220)EKQIKQIVHSIENNSTPLRIAEINGQKRLIPANVDNPKKSYIFEFLKEYAQSDPKGQESLLQHMRYLILLYLYGPDKITDDYCEEIEAWNFGSIVMDNEQLFSEEASMLIQDRIYVNQQIEEGRQSKDTAKVKKNKSKYRMLGDKIEHSINESVVKHYQEACKAVEEKDIPWIKYISDHVMSVYSSKNRVDLDKLSLPYLAKNTWNTWISFIAMKYVDMGKGVYHFAMSDVDKVGKQDNLIIGQIDPKFSDGISSFDYERIKAEDDLHRSMSGYIAFAVNNFARAICSDEFRKKNRKEDVLTVGLDEIPLYDNVKRKLLQYFGGASNWDDSIIDIIDDKDLVACIKENLYVARNVNFHFAGSEKVQKKQDDILEEIVRKETRDIGKHYRKVFYSNNVAVFYCDEDIIKLMNHLYQREKPYQAQIPSYNKVISKTYLPDLIFMLLKGKNRTKISDPSIMNMFRGTFYFLLKEIYYNDFLQASNLKEMFCEGLKNNVKNKKSEKPYQNFMRRFEELENMGMDFGEICQQIMTDYEQQNKQKKKTATAVMSEKDKKIRTLDNDTQKYKHFRTLLYIGLREAFIIYLKDEKNKEWYEFLREPVKREQPEEKEFVNKWKLNQYSDCSELILKDSLAAAWYVVAHFINQAQLNHLIGDIKNYIQFISDIDRRAKSTGNPVSESTEIQIERYRKILRVLEFAKFFCGQITNVLTDYYQDENDFSTHVGHYVKFEKKNMEPAHALQAFSNSLYACGKEKKKAGFYYDGMNPIVNRNITLASMYGNKKLLENAMNPVTEQDIRKYYSLMAELDSVLKNGAVCKSEDEQKNLRHFQNLKNRIELVDVLTLSELVNDLVAQLIGWVYIRERDMMYLQLGLHYIKLYFTDSVAEDSYLRTLDLEEGSIADGAVLYQIASLYSFNLPMYVKPNKSSVYCKKHVNSVATKFDIFEKEYCNGDETVIENGLRLFENINLHKDMVKFRDYLAHFKYFAKLDESILELYSKAYDFFFSYNIKLKKSVSYVLTNVLLSYFINAKLSFSTYKSSGNKTVQHRTTKISVVAQTDYFTYKLRSIVKNKNGVESIENDDRRCEV VNIAARDKEFVDEVCNVINYNSDKC2-30 Leptotrichia MKITKIDGISHKKYIKEGKLVKSTSEENKTDERLSELLTIRLD sp.TYIKNPDNASEEENRIRRENLKEFFSNKVLYLKDGILYLKDR Marseille-REKNQLQNKNYSEEDISEYDLKNKNNFLVLKKILLNEDINSE P3007ELEIFRNDFEKKLDKINSLKYSLEENKANYQKINENNIKKVE (SEQ IDGKSKRNIFYNYYKDSAKRNDYINNIQEAFDKLYKKEDIENLF No. 
221)FLIENSKKHEKYKIRECYHKIIGRKNDKENFATIIYEEIQNVNNMKELIEKVPNVSELKKSQVFYKYYLNKEKLNDENIKYVFCHFVEIEMSKLLKNYVYKKPSNISNDKVKRIFEYQSLKKLIENKLLNKLDTYVRNCGKYSFYLQDGEIATSDFIVGNRQNEAFLRNIIGVSSTAYFSLRNILETENENDITGRMRGKTVKNNKGEEKYISGEIDKLYDNNKQNEVKKNLKMFYSYDFNMNSKKEIEDFFSNIDEAISSIRHGIVHFNLELEGKDIFTFKNIVPSQISKKMFHDEINEKKLKLKIFKQLNSANVFRYLEKYKILNYLNRTRFEFVNKNIPFVPSFTKLYSRIDDLKNSLGIYWKTPKTNDDNKTKEITDAQIYLLKNIYYGEFLNYFMSNNGNFFEITKEIIELNKNDKRNLKTGFYKLQKFENLQEKTPKEYLANIQSLYMINAGNQDEEEKDTYIDFIQKIFLKGFMTYLANNGRLSLIYIGSDEETNTSLAEKKQEFDKFLKKYEQNNNIEIPYEINEFVREIKLGKILKYTERLNMFYLILKLLNHKELTNLKGSLEKYQSANKEEAFSDQLELINLLNLDNNRVTEDFELEADEIGKFLDFNGNKVKDNKELKKFDTNKIYFDGENIIKHRAFYNIKKYGMLNLLEKISDEAKYKISIEELKNYSKKKNEIEENHTTQENLHRKYARPRKDEKFTDEDYKKYEKAIRNIQQYTHLKNKVEFNELNLLQSLLLRILHRLVGYTSIWERDLRFRLKGEFPENQYIEEIFNFDNSKNVKYKNGQIVEKYINFYKELYKDDTEKISIYSDKKVKELKKEKKDLYIRNYIAHFNYIPNAEISLLEMLENLRKLLSYDRKLKNAIMKSIVDILKEYGFVVTFKIEKDKKIRIESLKSEEVVHLKKLKLKDNDKKKEPIKTYRNS KELCKLVKVMFEYKMKEKKSEN C2-31Bacteroides MRITKVKVKESSDQKDKMVLIHRKVGEGTLVLDENLADLTA ihuae (SEQPIIDKYKDKSFELSLLKQTLVSEKEMNIPKCDKCTAKERCLSC ID No. 
222)KQREKRLKEVRGAIEKTIGAVIAGRDIIPRLNIFNEDEICWLIKPKLRNEFTFKDVNKQVVKLNLPKVLVEYSKKNDPTLFLAYQQWIAAYLKNKKGHIKKSILNNRVVIDYSDESKLSKRKQALELWGEEYETNQRIALESYHTSYNIGELVTLLPNPEEYVSDKGEIRPAFHYKLKNVLQMHQSTVFGTNEILCINPIFNENRANIQLSAYNLEVVKYFEHYFPIKKKKKNLSLNQAIYYLKVETLKERLSLQLENALRMNLLQKGKIKKHEFDKNTCSNTLSQIKRDEFFVLNLVEMCAFAANNIRNIVDKEQVNEILSKKDLCNSLSKNTIDKELCTKFYGADFSQIPVAIWAMRGSVQQIRNEIVHYKAEAIDKIFALKTFEYDDMEKDYSDTPFKQYLELSIEKIDSFFIEQLSSNDVLNYYCTEDVNKLLNKCKLSLRRTSIPFAPGFKTIYELGCHLQDSSNTYRIGHYLMLIGGRVANSTVTKASKAYPAYRFMLKLIYNHLFLNKFLDNHNKRFFMKAVAFVLKDNRENARNKFQYAFKEIRMMNNDESIASYMSYIHSLSVQEQEKKGDKNDKVRYNTEKFIEKVFVKGFDDFLSWLGVEFILSPNQEERDKTVTREEYENLMIKDRVEHSINSNQESHIAFFTFCKLLDANHLSDLRNEWIKFRSSGDKEGFSYNFAIDIIELCLLTVDRVEQRRDGYKEQTELKEYLSFFIKGNESENTVWKGFYFQQDNYTPVLYSPIELIRKYGTLELLKLIIVDEDKITQGEFEEWQTLKKVVEDKVTRRNELHQEWEDMKNKSSFSQEKCSIYQKLCRDIDRYNWLDNKLHLVHLRKLHNLVIQILSRMARFIALWDRDFVLLDASRANDDYKLLSFFNFRDFINAKKTKTDDELLAEFGSKIEKKNAPFIKAEDVPLMVECIEAKRSFYQKVFFRNNLQVLADRNFIAHYNYISKTAKCSLFEMIIKLRTLMYYDRKLRNAVVKSIANVFDQNGMVLQLSLDDSHELKVDKVISKRIVHLKNNNIMTDQVPEEYYKICRRL LEMKK C2-32 SAMN05216357_1045MEFRDSIFKSLLQKEIEKAPLCFAEKLISGGVFSYYPSERLKEF [PorphyromonadaceaeVGNHPFSLFRKTMPFSPGFKRVMKSGGNYQNANRDGRFYD bacteriumLDIGVYLPKDGFGDEEWNARYFLMKLIYNQLFLPYFADAEN KH3CP3RA]HLFRECVDFVKRVNRDYNCKNNNSEEQAFIDIRSMREDESIA (SEQ IDDYLAFIQSNIIIEENKKKETNKEGQINFNKFLLQVFVKGFDSFL No. 223)KDRTELNFLQLPELQGDGTRGDDLESLDKLGAVVAVDLKLDATGIDADLNENISFYTFCKLLDSNHLSRLRNEIIKYQSANSDFSHNEDFDYDRIISIIELCMLSADHVSTNDNESIFPNNDKDFSGIRPYLSTDAKVETFEDLYVHSDAKTPITNATMVLNWKYGTDKLFERLMISDQDFLVTEKDYFVWKELKKDIEEKIKLREELHSLWVNTPKGKKGAKKKNGRETTGEFSEENKKEYLEVCREIDRYVNLDNKLHFVHLKRMHSLLIELLGRFVGFTYLFERDYQYYHLEIRSRRNKDAGVVDKLEYNKIKDQNKYDKDDFFACTFLYEKANKVRNFIAHFNYLTMWNSPQEEEHNSNLSGAKNSSGRQNLKCSLTELINELREVMSYDRKLKNAVTKAVIDLFDKHGMVIKFRIVNNNNNDNKNKHHLELDDIVPKKIMHLRGIKLKRQDG KPIPIQTDSVDPLYCRMWKKLLDLKPTPFC2-33 Listeria MHDAWAENPKKPQSDAFLKEYKACCEAIDTYNWHKNKAT ripariaLVYVNELHHLLIDILGRLVGYVAIADRDFQCMANQYLKSSG (SEQ IDHTERVDSWINTIRKNRPDYIEKLDIFMNKAGLFVSEKNGRNY No. 
224)IAHLNYLSPKHKYSLLYLFEKLREMLKYDRKLKNAVTKSLIDLLDKHGMCVVFANLKNNKHRLVIASLKPKKIETFKWKKIK C2-34 InsolitispinillumMRIIRPYGSSTVASPSPQDAQPLRSLQRQNGTFDVAEFSRRHP peregrinumELVLAQWVAMLDKIIRKPAPGKNSTALPRPTAEQRRLRQQV (SEQ IDGAALWAEMQRHTPVPPELKAVWDSKVHPYSKDNAPATAKT No. 225)PSHRGRWYDRFGDPETSAATVAEGVRRHLLDSAQPFRANGGQPKGKGVIEHRALTIQNGTLLHHHQSEKAGPLPEDWSTYRADELVSTIGKDARWIKVAASLYQHYGRIFGPTTPISEAQTRPEFVLHTAVKAYYRRLFKERKLPAERLERLLPRTGEALRHAVTVQHGNRSLADAVRIGKILHYGWLQNGEPDPWPDDAALYSSRYWGSDGQTDIKHSEAVSRVWRRALTAAQRTLTSWLYPAGTDAGDILLIGQKPDSIDRNRLPLLYGDSTRHWTRSPGDVWLFLKQTLENLRNSSFHFKTLSAFTSHLDGTCESEPAEQQAAQALWQDDRQQDHQQVFLSLRALDATTYLPTGPLHRIVNAVQSTDATLPLPRFRRVVTRAANTRLKGFPVEPVNRRTMEDDPLLRCRYGVLKLLYERGFRAWLETRPSIASCLDQSLKRSTKAAQTINGKNSPQGVEILSRATKLLQAEGGGGHGIHDLFDRLYAATAREMRVQVGYHHDAEAARQQAEFIEDLKCEVVARAFCAYLKTLGIQGDTFRRQPEPLPTWPDLPDLPSSTIGTAQAALYSVLHLMPVEDVGSLLHQLRRWLVALQARGGEDGTAITATIPLLELYLNRHDAKFSGGGAGTGLRWDDWQVFEDCQATFDRVFPPGPALDSHRLPLRGLREVLRFGRVNDLAALIGQDKITAAEVDRWHTAEQTIAAQQQRREALHEQLSRKKGTDAEVDEYRALVTAIADHRHLTAHVTLSNVVRLHRLMTTVLGRLVDYGGLWERDLTFVTLYEAHRLGGLRNLLSESRVNKFLDGQTPAALSKKNNAEENGMISKVLGDKARRQIRNDFAHFNMLQQGKKTINLTDEINNARKLMAHDRKLKNAITRSVTTLLQQDGLDIVWTMDASHRLTDAKIDSRNAIHLHKTHNRANIREPLHGKSYCRWVAALFGA TSTPSATKKSDKIR

In certain example embodiments, the RNA-targeting effector protein is a Cas13c effector protein as disclosed in U.S. Provisional Patent Application No. 62/525,165 filed Jun. 26, 2017, and PCT Application No. US 2017/047193 filed Aug. 16, 2017. Example wildtype orthologue sequences of Cas13c are provided in Table 4 below. In certain example embodiments, the CRISPR effector protein is a Cas13c protein from Table 3 or 4.

TABLE 3 Fusobacterium MEKFRRQNRNSIIKIIISNYDTKGIKELKVRYRKQAQLDTFIIKTEInecrophorum VNNDIFIKSIIEKAREKYRYSFLFDGEEKYHFKNKSSVEIVKKDIF subsp.SQTPDNMIRNYKITLKISEKNPRVVEAEIEDLMNSTILKDGRRSA funduliformeRREKSMTERKLIEEKVAKNYSLLANCPMEEVDSIKIYKIKRFLT ATCC 51357YRSNMLLYFASINSFLCEGIKGKDNETEEIWHLKDNDVRKEKV contig00003RENFKNKLIQSTENYNSSLKNQIEEKEKLLRKEFKKGAFYRTIIK (SEQ ID No.KLQQERIKELSEKSLTEDCEKIIKLYSKLRHSLMHYDYQYFENLF 226)ENKKNDDLMKDLNLDLFKSLPLIRKMKLNNKVNYLEDGDTLFVLQKTKKAKTLYQIYDALCEQKNGFNKFINDFFVSDGEENTVFKQIINEKFQSEMEFLEKRISESEKKNEKLKKKLDSMKAHFRNINSEDTKEAYFWDIHSSRNYKTKYNERKNLVNEYTELLGSSKEKKLLREEITKINRQLLKLKQEMEEITKKNSLFRLEYKMKIAFGFLFCEFDGNISKFKDEFDASNQEKIIQYHKNGEKYLTSFLKEEEKEKFNLEKMQKIIQKTEEEDWLLPETKNNLFKFYLLTYLLLPYELKGDFLGFVKKHYYDIKNVDFIDENQNNIQVSQTVEKQEDYFYHKIRLFEKNTKKYEIVKYSIVPNEKLKQYFEDLGIDIKYLTVEQKSEVSEEKNKKVSLKNNGMFNKTILLFVFKYYQIAFKLFNDIELYSLFFLREKSGKPLEIFRKELESKMKDGYLNFGQLLYVVYEVLVKNKDLDKILSKKIDYRKDKSFSPEIAYLRNFLSHLNYSKFLDNFMKINTNKSDENKEVLIPSIKIQKMIQFIEKCNLQNQIDFDFNFVNDFYMRKEKMFFIQLKQIFPDINSTEKQKMNEKEEILRNRYHLTDKKNEQIKDEHEAQSQLYEKILSLQKIYSSDKNNFYGRLKEEKLLFLEKQGKKKLSMEEIKDKIAGDISDLLGILKKEITRDIKDKLTEKFRYCEEKLLNLSFYNHQDKKKEESIRVFLIRDKNSDNFKFESILDDGSNKIFISKNGKEITIQCCDKVLETLIIEKNTLKISSNGKIISLIPHYSYSIDVKY FusobacteriumMEKFRRQNRSSIIKIIISNYDTKGIKELKVRYRKQAQLDTFIIKTEI necrophorumVNNDIFIKSBEKAREKYRYSFLFDGEEKYHFKNKSSVEIVKKDIF DJ-2SQTPDNMIRNYKITLKISEKNPRVVEAEIEDLMNSTILKDGRRSA contig0065,RREKSMTERKLIEEKVAENYSLLANCPMEEVDSIKIYKIKRFLTY whole genomeRSNMLLYFASINSFLCEGIKGKDNETEEIWHLKDNDVRKEKVKE shotgunNFKNKLIQSTENYNSSLKNQIEEKEKLLRKESKKGAFYRTIIKKL sequence (SEQQQERIKELSEKSLTEDCEKIIKLYSELRHPLMHYDYQYFENLFEN ID No. 
227)KENSELTKNLNLDIFKSLPLVRKMKLNNKVNYLEDNDTLFVLQKTKKAKTLYQIYDALCEQKNGFNKFINDFFVSDGEENTVFKQIINEKFQSEIEFLEKRISESEKKNEKLKKKLDSMKAHFRNINSEDTKEAYFWDIHSSRNYKTKYNERKNLVNEYTELLGSSKEKKLLREEITKINRQLLKLKQEMEEITKKNSLFRLEYKMKMAFGFLFCEFDGNISRFKDEFDASNQEKIIQYHKNGEKYLTYFLKEEEKEKFNLKKLQETIQKTGEENWLLPQNKNNLFKFYLLTYLLLPYELKGDFLGFVKKHYYDIKNVDFMDENQSSKIIESKEDDFYHKIRLFEKNTKKYEIVKYSIVPDKKLKQYFKDLGIDTKYLILDQKSEVSGEKNKKVSLKNNGMFNKTILLFVFKYYQIAFKLFNDIELYSLFFLREKSGKPFEVFLKELKDKMIGKQLNFGQLLYVVYEVLVKNKDLSEILSERIDYRKDMCFSAEIADLRNFLSHLNYSKFLDNFMKINTNKSDENKEVLIPSIKIQKMIKFIEECNLQSQIDFDFNFVNDFYMRKEKMFFIQLKQIFPDINSTEKQKMNEKEEILRNRYHLTDKKNEQIKDEHEAQSQLYEKILSLQKIYSSDKNNFYGRLKEEKLLFLEKQEKKKLSMEEIKDKIAGDISDLLGILKKEITRDIKDKLTEKFRYCEEKLLNLSFYNHQDKKKEESIRVFLIRDKNSDNFKFESILDDGSNKIFISKNGKEITIQCCDKVLETLIIEKNTLKISSNGKIISLIPHYSYSIDVKY FusobacteriumMKVRYRKQAQLDTFIIKTEIVNNDIFIKSIIEKAREKYRYSFLFDG necrophorumEEKYHFKNKSSVEIVKNDIFSQTPDNMIRNYKITLKISEKNPRVV BFTR-1EAEIEDLMNSTILKDGRRSARREKSMTERKLIEEKVAENYSLLA contig0068NCPIEEVDSIKIYKIKRFLTYRSNMLLYFASINSFLCEGIKGKDNE (SEQ ID No.TEEIWHLKDNDVRKEKVKENFKNKLIQSTENYNSSLKNQIEEKE 228)KLSSKEFKKGAFYRTIIKKLQQERIKELSEKSLTEDCEKIIKLYSELRHPLMHYDYQYFENLFENKENSELTKNLNLDIFKSLPLVRKMKLNNKVNYLEDNDTLFVLQKTKKAKTLYQIYDALCEQKNGFNKFINDFFVSDGEENTVFKQIINEKFQSEMEFLEKRISESEKKNEKLKKKLDSMKAHFRNINSEDTKEAYFWDIHSSRNYKTKYNERKNLVNEYTKLLGSSKEKKLLREEITKINRQLLKLKQEMEEITKKNSLFRLEYKMKIAFGFLFCEFDGNISKFKDEFDASNQEKIIQYHKNGEKYLTSFLKEEEKEKFNLEKMQKIIQKTEEEDWLLPETKNNLFKFYLLTYLLLPYELKGDFLGFVKKHYYDIKNVDFMDENQNNIQVSQTVEKQEDYFYHKIRLFEKNTKKYEIVKYSIVPNEKLKQYFEDLGIDIKYLTGSVESGEKWLGENLGIDIKYLTVEQKSEVSEEKNKKVSLKNNGMFNKTILLFVFKYYQIAFKLFNDIELYSLFFLREKSEKPFEVFLEELKDKMIGKQLNFGQLLYVVYEVLVKNKDLDKILSKKIDYRKDKSFSPEIAYLRNFLSHLNYSKFLDNFMKINTNKSDENKEVLIPSIKIQKMIQFIEKCNLQNQIDFDFNFVNDFYMRKEKMFFIQLKQIFPDINSTEKQKKSEKEEILRKRYHLINKKNEQIKDEHEAQSQLYEKILSLQKIFSCDKNNFYRRLKEEKLLFLEKQGKKKISMKEIKDKIASDISDLLGILKKEITRDIKDKLTEKFRYCEEKLLNISFYNHQDKKKEEGIRVFLIRDKNSDNFKFESILDDGSNKIFISKNGKEITIQCCDKVLETLMIEKNTLKISSNGKIISLIPHYSYSIDVKY 
FusobacteriumMTEKKSIIFKNKSSVEIVKKDIFSQTPDNMIRNYKITLKISEKNPR necrophorumVVEAEIEDLMNSTILKDGRRSARREKSMTERKLIEEKVAENYSL subsp.LANCPMEEVDSIKIYKIKRFLTYRSNMLLYFASINSFLCEGIKGK funduliformeDNETEEIWHLKDNDVRKEKVKENFKNKLIQSTENYNSSLKNQIE 1_1_36SEKEKLLRKESKKGAFYRTIIKKLQQERIKELSEKSLTEDCEKIIKL cont1.14 (SEQYSELRHPLMHYDYQYFENLFENKENSELTKNLNLDIFKSLPLVR ID No. 229)KMKLNNKVNYLEDNDTLFVLQKTKKAKTLYQIYDALCEQKNGFNKFINDFFVSDGEENTVFKQIINEKFQSEMEFLEKRISESEKKNEKLKKKFDSMKAHFHNINSEDTKEAYFWDIHSSSNYKTKYNERKNLVNEYTELLGSSKEKKLLREEITQINRKLLKLKQEMEEITKKNSLFRLEYKMKIAFGFLFCEFDGNISKFKDEFDASNQEKIIQYHKNGEKYLTYFLKEEEKEKFNLEKMQKIIQKTEEEDWLLPETKNNLFKFYLLTYLLLPYELKGDFLGFVKKHYYDIKNVDFMDENQNNIQVSQTVEKQEDYFYHKIRLFEKNTKKYEIVKYSIVPNEKLKQYFEDLGIDIKYLTGSVESGEKWLGENLGIDIKYLTVEQKSEVSEEKIKK FL FusobacteriumMGKPNRSSIIKIIISNYDNKGIKEVKVRYNKQAQLDTFLIKSELK perfoetensDGKFILYSIVDKAREKYRYSFEIDKTNINKNEILIIKKDIYSNKED ATCC 29250KVIRKYILSFEVSEKNDRTIVTKIKDCLETQKKEKFERENTRRLIST364DRAFT_scaffold00009.9_CETERKLLSEETQKTYSKIACCSPEDIDSVKIYKIKRYLAYRSNML (SEQ IDLFFSLINDIFVKGVVKDNGEEVGEIWRIIDSKEIDEKKTYDLLVE No. 
230)NFKKRMSQEFINYKQSIENKIEKNTNKIKEIEQKLKKEKYKKEINRLKKQLIELNRENDLLEKDKIELSDEEIREDIEKILKIYSDLRHKLMHYNYQYFENLFENKKISKEKNEDVNLTELLDLNLFRYLPLVRQLKLENKTNYLEKEDKITVLGVSDSAIKYYSYYNFLCEQKNGFNNFINSFFSNDGEENKSFKEKINLSLEKEIEIMEKETNEKIKEINKNELQLMKEQKELGTAYVLDIHSLNDYKISHNERNKNVKLQNDIMNGNRDKNALDKINKKLVELKIKMDKITKRNSILRLKYKLQVAYGFLMEEYKGNIKKFKDEFDISKEKIKSYKSKGEKYLEVKSEKKYITKILNSIEDIHNITWLKNQEENNLFKFYVLTYILLPFEFRGDFLGFVKKHYYDIKNVEFLDENNDRLTPEQLEKMKNDSFFNKIRLFEKNSKKYDILKESILTSERIGKYFSLLNTGAKYFEYGGEENRGIFNKNIIIPIFKYYQIVLKLYNDVELAMLLTLSESDEKDINKIKELVTLKEKVSPKKIDYEKKYKFSVLLDCFNRIINLGKKDFLASEEVKEVAKTFTNLAYLRNKICHLNYSKFIDDLLTIDTNKSTTDSEGKLLINDRIRKLIKFIRENNQKMNISIDYNYINDYYMKKEKFIFGQRKQAKTIIDSGKKANKRNKAEELLKMYRVKKENINLIYELSKKLNELTKSELFLLDKKLLKDIDFTDVKIKNKSFFELKNDVKEVANIKQALQKHSSELIGIYKKEVIMAIKRSIVSKLIYDEEKVLSIIIYDKTNKKYEDFLLEIRRERDINKFQFLIDEKKEKLGYEKIIETKEKKKVVVKIQNNSELVSEPRIIKNKDKKKAKTPEEISKLGILDLTNHYCFNLKI TL FusobacteriumMENKGNNKKIDFDENYNILVAQIKEYFTKEIENYNNRIDNIIDKK ulcerans ATCCELLKYSEKKEESEKNKKLEELNKLKSQKLKILTDEEIKADVIKII 49185 cont2.38KIFSDLRHSLMHYEYKYFENLFENKKNEELAELLNLNLFKNLTL (SEQ ID No.LRQMKIENKTNYLEGREEFNIIGKNIKAKEVLGHYNLLAEQKNG 231)FNNFINSFFVQDGTENLEFKKLIDEHFVNAKKRLERNIKKSKKLEKELEKMEQHYQRLNCAYVWDIHTSTTYKKLYNKRKSLIEEYNKQINEIKDKEVITAINVELLRIKKEMEEITKSNSLFRLKYKMQIAYAFLEIEFGGNIAKFKDEFDCSKMEEVQKYLKKGVKYLKYYKDKEAQKNYEFPFEEIFENKDTHNEEWLENTSENNLFKFYILTYLLLPMEFKGDFLGVVKKHYYDIKNVDFTDESEKELSQVQLDKMIGDSFFHKIRLFEKNTKRYEIIKYSILTSDEIKRYFRLLELDVPYFEYEKGTDEIGIFNKNIILTIFKYYQIIFRLYNDLEIHGLFNISSDLDKILRDLKSYGNKNINFREFLYVIKQNNNSSTEEEYRKIWENLEAKYLRLHLLTPEKEEIKTKTKEELEKLNEISNLRNGICHLNYKEIIEEILKTEISEKNKEATLNEKIRKVINFIKENELDKVELGFNFINDFFMKKEQFMFGQIKQVKEGNSDSITTERERKEKNNKKLKETYELNCDNLSEFYETSNNLRERANSSSLLEDSAFLKKIGLYKVKNNKVNSKVKDEEKRIENIKRKLLKDSSDIMGMYKAEVVKKLKEKLILIFKHDEEKRIYVTVYDTSKAVPENISKEILVKRNNSKEEYFFEDNNKKYVTEYYTLEITETNELKVIPAKKLEGKEFKTEKNKENKLMLNNHYC FNVKIIY AnaerosalibacterMKSGRREKAKSNKSSIVRVIISNFDDKQVKEIKVLYTKQGGIDVI sp. 
ND1KFKSTEKDEKGRMKFNFDCAYNRLEEEEFNSFGGKGKQSFFVT genomeTNEDLTELHVTKRHKTTGEIIKDYTIQGKYTPIKQDRTKVTVSIT assemblyDNKDHFDSNDLGDKIRLSRSLTQYTNRILLDADVMKNYREIVCS AnaerosalibacterDSEKVDETINIDSQEIYKINRFLSYRSNMIIYYQMINNFLLHYDG massiliensisEEDKGGNDSINLINEIWKYENKKNDEKEKIIERSYKSIEKSINQYI ND1 (SEQ IDLNHNTEVESGDKEKKIDISEERIKEDLKKTFILFSRLRHYMVHYN No. 232)YKFYENLYSGKNFIIYNKDKSKSRRFSELLDLNIFKELSKIKLVKNRAVSNYLDKKTTIHVLNKNINAIKLLDIYRDICETKNGFNNFINNMMTISGEEDKEYKEMVTKHFNENMNKLSIYLENFKKHSDFKTNNKKKETYNLLKQELDEQKKLRLWFNAPYVYDIHSSKKYKELYVERKKYVDIHSKLIEAGINNDNKKKLNEINVKLCELNTEMKEMTKLNSKYRLQYKLQLAFGFILEEFNLDIDKFVSAFDKDNNLTISKFMEKRETYLSKSLDRRDNRFKKLIKDYKFRDTEDIFCSDRENNLVKLYILMYILLPVEIRGDFLGFVKKNYYDLKHVDFIDKRNNDNKDTFFHDLRLFEKNVKRLEVTSYSLSDGFLGKKSREKFGKELEKFIYKNVSIALPTNIDIKEFNKSLVLPMMKNYQIIFKLLNDIEISALFLIAKKEGNEGSITFKKVIDKVRKEDMNGNINFSQVMKMALNEKVNCQIRNSIAHINMKQLYIEPLNIYINNNQNKKTISEQMEEIIDICITKGLTGKELNKNIINDYYMKKEKLVFNLKLRKRNNLVSIDAQQKNMKEKSILNKYDLNYKDENLNIKEIILKVNDLNNKQKLLKETTEGESNYKNALSKDILLLNGIIRKNINFKIKEMILGIIQQNEYRYVNINIYDKIRKEDHNIDLKINNKYIEISCYENKSNESTDERINFKIKYMDLKVKNELLVPSCYEDIYIKKKIDLEIRYIENCKVVYIDIYYKKYNINLEFDGKTLFVKFNKDVKKNNQKVNLESNYIQNIKFIV S

TABLE 4 Name Sequence EH019081MTEKKSIIFKNKSSVEIVKKDIFSQTPDNMIRNYKITLKISEKNPRVVEAEIEDLMNSTILKDGRRSARREKSMTERKLIEEKVAENYSLLANCPMEEVDSIKIYKIKRFLTYRSNMLLYFASINSFLCEGIKGKDNETEEIWHLKDNDVRKEKVKENFKNKLIQSTENYNSSLKNQIEEKEKLLRKESKKGAFYRTIIKKLQQERIKELSEKSLTEDCEKIIKLYSELRHPLMHYDYQYFENLFENKENSELTKNLNLDIFKSLPLVRKMKLNNKVNYLEDNDTLFVLQKTKKAKTLYQIYDALCEQKNGFNKFINDFFVSDGEENTVFKQIINEKFQSEMEFLEKRISESEKKNEKLKKKFDSMKAHFHNINSEDTKEAYFWDIHSSSNYKTKYNERKNLVNEYTELLGSSKEKKLLREEITQINRKLLKLKQEMEEITKKNSLFRLEYKMKIAFGFLFCEFDGNISKFKDEFDASNQEKIIQYHKNGEKYLTYFLKEEEKEKFNLEKMQKIIQKTEEEDWLLPETKNNLFKFYLLTYLLLPYELKGDFLGFVKKHYYDIKNVDFMDENQNNIQVSQTVEKQEDYFYHKIRLFEKNTKKYEIVKYSIVPNEKLKQYFEDLGIDIKYLTGSVESGEKWLGENLGIDIKYLTVEQKSEVSEEKIKKFL WP_094899336MEKDKKGEKIDISQEMIEEDLRKILILFSRLRHSMVHYDYEFYQALYSGKDFVISDKNNLENRMISQLLDLNIFKELSKVKLIKDKAISNYLDKNTTIHVLGQDIKAIRLLDIYRDICGSKNGFNKFINTMITISGEEDREYKEKVIEHFNKKMENLSTYLEKLEKQDNAKRNNKRVYNLLKQKLIEQQKLKEWFGGPYVYDIHSSKRYKELYIERKKLVDRHSKLFEEGLDEKNKKELTKINDELSKLNSEMKEMTKLNSKYRLQYKLQLAFGFILEEFDLNIDTFINNFDKDKDLIISNFMKKRDIYLNRVLDRGDNRLKNIIKEYKFRDTEDIFCNDRDNNLVKLYILMYILLPVEIRGDFLGFVKKNYYDMKHVDFIDKKDKEDKDTFFHDLRLFEKNIRKLEITDYSLSSGFLSKEHKVDIEKKINDFINRNGAMKLPEDITIEEFNKSLILPIMKNYQINFKLLNDIEISALFKIAKDRSITFKQAIDEIKNEDIKKNSKKNDKNNHKDKNINFTQLMKRALHEKIPYKAGMYQIRNNISHIDMEQLYIDPLNSYMNSNKNNITISEQIEKIIDVCVTGGVTGKELNNNIINDYYMKKEKLVFNLKLRKQNDIVSIESQEKNKREEFVFKKYGLDYKDGEINIIEVIQKVNSLQEELRNIKETSKEKLKNKETLFRDISLINGTIRKNINFKIKEMVLDIVRMDEIRHINIHIYYKGENYTRSNIIKFKYAIDGENKKYYLKQHEINDINLELKDKFVTLICNMDKHPNKNKQTINLESNYIQNVKFIIP 
WP_040490876MENKGNNKKIDFDENYNILVAQIKEYFTKEIENYNNRIDNIIDKKELLKYSEKKEESEKNKKLEELNKLKSQKLKILTDEEIKADVIKIIKIFSDLRHSLMHYEYKYFENLFENKKNEELAELLNLNLFKNLTLLRQMKIENKTNYLEGREEFNIIGKNIKAKEVLGHYNLLAEQKNGFNNFINSFFVQDGTENLEFKKLIDEHFVNAKKRLERNIKKSKKLEKELEKMEQHYQRLNCAYVWDIHTSTTYKKLYNKRKSLIEEYNKQINEIKDKEVITAINVELLRIKKEMEEITKSNSLFRLKYKMQIAYAFLEIEFGGNIAKFKDEFDCSKMEEVQKYLKKGVKYLKYYKDKEAQKNYEFPFEEIFENKDTHNEEWLENTSENNLFKFYILTYLLLPMEFKGDFLGVVKKHYYDIKNVDFTDESEKELSQVQLDKMIGDSFFHKIRLFEKNTKRYEIIKYSILTSDEIKRYFRLLELDVPYFEYEKGTDEIGIFNKNIILTIFKYYQIIFRLYNDLEIHGLFNISSDLDKILRDLKSYGNKNINFREFLYVIKQNNNSSTEEEYRKIWENLEAKYLRLHLLTPEKEEIKTKTKEELEKLNEISNLRNGICHLNYKEIIEEILKTEISEKNKEATLNEKIRKVINFIKENELDKVELGFNFINDFFMKKEQFMFGQIKQVKEGNSDSITTERERKEKNNKKLKETYELNCDNLSEFYETSNNLRERANSSSLLEDSAFLKKIGLYKVKNNKVNSKVKDEEKRIENIKRKLLKDSSDIMGMYKAEVVKKLKEKLILIFKHDEEKRIYVTVYDTSKAVPENISKEILVKRNNSKEEYFFEDNNKKYVTEYYTLEITETNELKVIPAKKLEGKEFKTEKNKENKLMLNNHYC FNVKIIYWP_047396607MEEIKHKKNKSSIIRVIVSNYDMTGIKEIKVLYQKQGGVDTFNLKTIINLESGNLEIISCKPKEREKYRYEFNCKTEINTISITKKDKVLKKEIRKYSLELYFKNEKKDTVVAKVTDLLKAPDKIEGERNHLRKLSSSTERKLLSKTLCKNYSEISKTPIEEIDSIKIYKIKRFLNYRSNFLIYFALINDFLCAGVKEDDINEVWLIQDKEHTAFLENRIEKITDYIFDKLSKDIENKKNQFEKRIKKYKTSLEELKTETLEKNKTFYIDSIKTKITNLENKITELSLYNSKESLKEDLIKIISIFTNLRHSLMHYDYKSFENLFENIENEELKNLLDLNLFKSIRMSDEFKTKNRTNYLDGTESFTIVKKHQNLKKLYTYYNNLCDKKNGFNTFINSFFVTDGIENTDFKNLIILHFEKEMEEYKKSIEYYKIKISNEKNKSKKEKLKEKIDLLQSELINMREHKNLLKQIYFFDIHNSIKYKELYSERKNLIEQYNLQINGVKDVTAINHINTKLLSLKNKMDKITKQNSLYRLKYKLKIAYSFLMIEFDGDVSKFKNNFDPTNLEKRVEYLDKKEEYLNYTAPKNKFNFAKLEEELQKIQSTSEMGADYLNVSPENNLFKFYILTYIMLPVEFKGDFLGFVKNHYYNIKNVDFMDESLLDENEVDSNKLNEKIENLKDSSFFNKIRLFEKNIKKYEIVKYSVSTQENMKEYFKQLNLDIPYLDYKSTDEIGIFNKNMILPIFKYYQNVFKLCNDIEIHALLALANKKQQNLEYAIYCCSKKNSLNYNELLKTFNRKTYQNLSFIRNKIAHLNYKELFSDLFNNELDLNTKVRCLIEFSQNNKFDQIDLGMNFINDYYMKKTRFIFNQRRLRDLNVPSKEKIIDGKRKQQNDSNNELLKKYGLSRTNIKDIFNKAWY 
WP_035935671MKVRYRKQAQLDTFIIKTEIVNNDIFIKSIIEKAREKYRYSFLFDGEEKYHFKNKSSVEIVKNDIFSQTPDNMIRNYKITLKISEKNPRVVEAEIEDLMNSTILKDGRRSARREKSMTERKLIEEKVAENYSLLANCPIEEVDSIKIYKIKRFLTYRSNMLLYFASINSFLCEGIKGKDNETEEIWHLKDNDVRKEKVKENFKNKLIQSTENYNSSLKNQIEEKEKLSSKEFKKGAFYRTIIKKLQQERIKELSEKSLTEDCEKIIKLYSELRHPLMHYDYQYFENLFENKENSELTKNLNLDIFKSLPLVRKMKLNNKVNYLEDNDTLFVLQKTKKAKTLYQIYDALCEQKNGFNKFINDFFVSDGEENTVFKQIINEKFQSEMEFLEKRISESEKKNEKLKKKLDSMKAHFRNINSEDTKEAYFWDIHSSRNYKTKYNERKNLVNEYTKLLGSSKEKKLLREEITKINRQLLKLKQEMEEITKKNSLFRLEYKMKIAFGFLFCEFDGNISKFKDEFDASNQEKIIQYHKNGEKYLTSFLKEEEKEKFNLEKMQKIIQKTEEEDWLLPETKNNLFKFYLLTYLLLPYELKGDFLGFVKKHYYDIKNVDFMDENQNNIQVSQTVEKQEDYFYHKIRLFEKNTKKYEIVKYSIVPNEKLKQYFEDLGIDIKYLTGSVESGEKWLGENLGIDIKYLTVEQKSEVSEEKNKKVSLKNNGMFNKTILLFVFKYYQIAFKLFNDIELYSLFFLREKSEKPFEVFLEELKDKMIGKQLNFGQLLYVVYEVLVKNKDLDKILSKKIDYRKDKSFSPEIAYLRNFLSHLNYSKFLDNFMKINTNKSDENKEVLIPSIKIQKMIQFIEKCNLQNQIDFDFNFVNDFYMRKEKMFFIQLKQIFPDINSTEKQKKSEKEEILRKRYHLINKKNEQIKDEHEAQSQLYEKILSLQKIFSCDKNNFYRRLKEEKLLFLEKQGKKKISMKEIKDKIASDISDLLGILKKEITRDIKDKLTEKFRYCEEKLLNISFYNHQDKKKEEGIRVFLIRDKNSDNFKFESILDDGSNKIFISKNGKEITIQCCDKVLETLMIEKNTLKISSNGKIISLIPHYSYSIDVKY 
WP_035906563MEKFRRQNRSSIIKIIISNYDTKGIKELKVRYRKQAQLDTFIIKTEIVNNDIFIKSIIEKAREKYRYSFLFDGEEKYHFKNKSSVEIVKKDIFSQTPDNMIRNYKITLKISEKNPRVVEAEIEDLMNSTILKDGRRSARREKSMTERKLIEEKVAENYSLLANCPMEEVDSIKIYKIKRFLTYRSNMLLYFASINSFLCEGIKGKDNETEEIWHLKDNDVRKEKVKENFKNKLIQSTENYNSSLKNQIEEKEKLLRKESKKGAFYRTIIKKLQQERIKELSEKSLTEDCEKIIKLYSELRHPLMHYDYQYFENLFENKENSELTKNLNLDIFKSLPLVRKMKLNNKVNYLEDNDTLFVLQKTKKAKTLYQIYDALCEQKNGFNKFINDFFVSDGEENTVFKQIINEKFQSEIEFLEKRISESEKKNEKLKKKLDSMKAHFRNINSEDTKEAYFWDIHSSRNYKTKYNERKNLVNEYTELLGSSKEKKLLREEITKINRQLLKLKQEMEEITKKNSLFRLEYKMKMAFGFLFCEFDGNISRFKDEFDASNQEKIIQYHKNGEKYLTYFLKEEEKEKFNLKKLQETIQKTGEENWLLPQNKNNLFKFYLLTYLLLPYELKGDFLGFVKKHYYDIKNVDFMDENQSSKIIESKEDDFYHKIRLFEKNTKKYEIVKYSIVPDKKLKQYFKDLGIDTKYLILDQKSEVSGEKNKKVSLKNNGMFNKTILLFVFKYYQIAFKLFNDIELYSLFFLREKSGKPFEVFLKELKDKMIGKQLNFGQLLYVVYEVLVKNKDLSEILSERIDYRKDMCFSAEIADLRNFLSHNYSKFLDNFMKINTNKSDENKEVLIPSIKIQKMIKFIEECNLQSQIDFDFNFVNDFYMRKEKMFFIQLKQIFPDINSTEKQKMNEKEEILRNRYHLTDKKNEQIKDEHEAQSQLYEKILSLQKIYSSDKNNFYGRLKEEKLLFLEKQEKKKLSMEEIKDKIAGDISDLLGILKKEITRDIKDKLTEKFRYCEEKLLNLSFYNHQDKKKEESIRVFLIRDKNSDNFKFESILDDGSNKIFISKNGKEITIQCCDKVLETLIIEKNTLKISSNGKIISLIPHYSYSIDVKY 
WP_042678931MKSGRREKAKSNKSSIVRVIISNFDDKQVKEIKVLYTKQGGIDVIKFKSTEKDEKGRMKFNFDCAYNRLEEEEFNSFGGKGKQSFFVTTNEDLTELHVTKRHKTTGEIIKDYTIQGKYTPIKQDRTKVTVSITDNKDHFDSNDLGDKIRLSRSLTQYTNRILLDADVMKNYREIVCSDSEKVDETINIDSQEIYKINRFLSYRSNMIIYYQMINNFLLHYDGEEDKGGNDSINLINEIWKYENKKNDEKEKIIERSYKSIEKSINQYILNHNTEVESGDKEKKIDISEERIKEDLKKTFILFSRLRHYMVHYNYKFYENLYSGKNFIIYNKDKSKSRRFSELLDLNIFKELSKIKLVKNRAVSNYLDKKTTIHVLNKNINAIKLLDIYRDICETKNGFNNFINNMMTISGEEDKEYKEMVTKHFNENMNKLSIYLENFKKHSDFKTNNKKKETYNLLKQELDEQKKLRLWFNAPYVYDIHSSKKYKELYVERKKYVDIHSKLIEAGINNDNKKKLNEINVKLCELNTEMKEMTKLNSKYRLQYKLQLAFGFILEEFNLDIDKFVSAFDKDNNLTISKFMEKRETYLSKSLDRRDNRFKKLIKDYKFRDTEDIFCSDRENNLVKLYILMYILLPVEIRGDFLGFVKKNYYDLKHVDFIDKRNNDNKDTFFHDLRLFEKNVKRLEVTSYSLSDGFLGKKSREKFGKELEKFIYKNVSIALPTNIDIKEFNKSLVLPMMKNYQIIFKLLNDIEISALFLIAKKEGNEGSITFKKVIDKVRKEDMNGNINFSQVMKMALNEKVNCQIRNSIAHINMKQLYIEPLNIYINNNQNKKTISEQMEEIIDICITKGLTGKELNKNIINDYYMKKEKLVFNLKLRKRNNLVSIDAQQKNMKEKSILNKYDLNYKDENLNIKEIILKVNDLNNKQKLLKETTEGESNYKNALSKDILLLNGIIRKNINFKIKEMILGIIQQNEYRYVNINIYDKIRKEDHNIDLKINNKYIEISCYENKSNESTDERINFKIKYMDLKVKNELLVPSCYEDIYIKKKIDLEIRYIENCKVVYIDIYYKKYNINLEFDGKTLFVKFNKDVKKNNQKVNLESNYIQNIKFIVS 
WP_062627846MEKFRRQNRNSIIKIIISNYDTKGIKELKVRYRKQAQLDTFIIKTEIVNNDIFIKSIIEKAREKYRYSFLFDGEEKYHFKNKSSVEIVKKDIFSQTPDNMIRNYKITLKISEKNPRVVEAEIEDLMNSTILKDGRRSARREKSVTERKLIEEKVAENYSLLANCPMEEVDSIKIYKIKRFLTYRSNMLLYFASINSFLCEGIKGKENETEEIWHLKDNDVRKEKVKENFKNKLIQSTENYNSSLKNQIEEKEKLLRKESKKGAFYRTIIKKLQQERIKELSEKSLTEDCEKIIKLYSKLRHSLMHYDYQYFENLFENKETPELKDKLDLHLFKSLPLIRKMKLNNKVNYLEDGDTLFVLQKTKKAKTLYQIYDALCEQKNGFNKFINDFFVSDGEENTVFKQIINEKFQSEMEFLGKRISESEEKNPKLKKKFDSMKAHFHNINSEDTKEAYFWDIHSSSNYKTKYNERKNLVNEYTELLGSSKEKKLLREEITQINRKLLKLKQEMEEITKKNSLFRLEYKMKMAFGFLFCEFDGNISRFKDEFDASNQEKIIQYHKNGEKYLTYFLKEEEKEKFNLKKLQETIQKTGKENWLLPQNKNNLFKFYLLTYLLLPYELKGDFLGFVKKHYYDIKNVDFMDENQSSKIIESKEDDFYHKIRLFEKNTKKYEIVKYSIVPDEKLKQYFKDLGIDTKYLILEQKSEVSGEKNKKVSLKNNGMFNKTILLFVFKYYQIAFKLFNDIELYSLFFLREKSGKPFEVFLKELKDKMIGKQLNFGQLLYVIYEVLVKNKDLSEILSERIDYRKDMCFSAEIADLRNFLSHLNYSKFLDNFMKINTNKSDENKEVLIPSIKIQKMIKFIEECNLQSQIDFDFNFVNDFYMRKEKMFFIQLKQIFPDINSTEKQKMNEKEEILRNRYHLTDKKNEQIKDEHEAQSQLYEKILSLQKIYSSDKNNFYGRLKEEKLLFLGKQGKKKLSMEEIKDKIAGDISDLLGILKKEITRDIKDKLTEKFRYCEEKLLNLSFYNHQDKKKEESIRVFLIRDKNSDNFKFESILDDGSNKIFISKNGKEITIQCCDKVLETLMIEKNTLKISSNGKIISLVPHYSYSIDVKY 
WP_005959231MEKFRRQNRNSIIKIIISNYDTKGIKELKVRYRKQAQLDTFIIKTEIVNNDIFIKSIIEKAREKYRYSFLFDGEEKYHFKNKSSVEIVKKDIFSQTPDNMIRNYKITLKISEKNPRVVEAEIEDLMNSTILKDGRRSARREKSMTERKLIEEKVAKNYSLLANCPMEEVDSIKIYKIKRFLTYRSNMLLYFASINSFLCEGIKGKDNETEEIWHLKDNDVRKEKVRENFKNKLIQSTENYNSSLKNQIEEKEKLLRKEFKKGAFYRTIIKKLQQERIKELSEKSLTEDCEKIIKLYSKLRHSLMHYDYQYFENLFENKKNDDLMKDLNLDLFKSLPLIRKMKLNNKVNYLEDGDTLFVLQKTKKAKTLYQIYDALCEQKNGFNKFINDFFVSDGEENTVFKQIINEKFQSEMEFLEKRISESEKKNEKLKKKLDSMKAHFRNINSEDTKEAYFWDIHSSRNYKTKYNERKNLVNEYTELLGSSKEKKLLREEITKINRQLLKLKQEMEEITKKNSLFRLEYKMKIAFGFLFCEFDGNISKFKDEFDASNQEKIIQYHKNGEKYLTSFLKEEEKEKFNLEKMQKIIQKTEEEDWLLPETKNNLFKFYLLTYLLLPYELKGDFLGFVKKHYYDIKNVDFIDENQNNIQVSQTVEKQEDYFYHKIRLFEKNTKKYEIVKYSIVPNEKLKQYFEDLGIDIKYLTVEQKSEVSEEKNKKVSLKNNGMFNKTILLFVFKYYQIAFKLFNDIELYSLFFLREKSGKPLEIFRKELESKMKDGYLNFGQLLYVVYEVLVKNKDLDKILSKKIDYRKDKSFSPEIAYLRNFLSHLNYSKFLDNFMKINTNKSDENKEVLIPSIKIQKMIQFIEKCNLQNQIDFDFNFVNDFYMRKEKMFFIQLKQIFPDINSTEKQKMNEKEEILRNRYHLTDKKNEQIKDEHEAQSQLYEKILSLQKIYSSDKNNFYGRLKEEKLLFLEKQGKKKLSMEEIKDKIAGDISDLLGILKKEITRDIKDKLTEKFRYCEEKLLNLSFYNHQDKKKEESIRVFLIRDKNSDNFKFESILDDGSNKIFISKNGKEITIQCCDKVLETLIIEKNTLKISSNGKIISLIPHYSYSIDVKY 
WP_027128616MGKPNRSSIIKIIISNYDNKGIKEVKVRYNKQAQLDTFLIKSELKDGKFILYSIVDKAREKYRYSFEIDKTNINKNEILIIKKDIYSNKEDKVIRKYILSFEVSEKNDRTIVTKIKDCLETQKKEKFERENTRRLISETERKLLSEETQKTYSKIACCSPEDIDSVKIYKIKRYLAYRSNMLLFFSLINDIFVKGVVKDNGEEVGEIWRIIDSKEIDEKKTYDLLVENFKKRMSQEFINYKQSIENKIEKNTNKIKEIEQKLKKEKYKKEINRLKKQLIELNRENDLLEKDKIELSDEEIREDIEKILKIYSDLRHKLMHYNYQYFENLFENKKISKEKNEDVNLTELLDLNLFRYLPLVRQLKLENKTNYLEKEDKITVLGVSDSAIKYYSYYNFLCEQKNGFNNFINSFFSNDGEENKSFKEKINLSLEKEIEIMEKETNEKIKEINKNELQLMKEQKELGTAYVLDIHSLNDYKISHNERNKNVKLQNDIMNGNRDKNALDKINKKLVELKIKMDKITKRNSILRLKYKLQVAYGFLMEEYKGNIKKFKDEFDISKEKIKSYKSKGEKYLEVKSEKKYITKILNSIEDIHNITWLKNQEENNLFKFYVLTYILLPFEFRGDFLGFVKKHYYDIKNVEFLDENNDRLTPEQLEKMKNDSFFNKIRLFEKNSKKYDILKESILTSERIGKYFSLLNTGAKYFEYGGEENRGIFNKNIIIPIFKYYQIVLKLYNDVELAMLLTLSESDEKDINKIKELVTLKEKVSPKKIDYEKKYKFSVLLDCFNRIINLGKKDFLASEEVKEVAKTFTNLAYLRNKICHLNYSKFIDDLLTIDTNKSTTDSEGKLLINDRIRKLIKFIRENNQKMNISIDYNYINDYYMKKEKFIFGQRKQAKTIIDSGKKANKRNKAEELLKMYRVKKENINLIYELSKKLNELTKSELFLLDKKLLKDIDFTDVKIKNKSFFELKNDVKEVANIKQALQKHSSELIGIYKKEVIMAIKRSIVSKLIYDEEKVLSIIIYDKTNKKYEDFLLEIRRERDINKFQFLIDEKKEKLGYEKIIETKEKKKVVVKIQNNSELVSEPRIIKNKDKKKAKTPEEISKLGILDLTNHYCFNLKITL 
WP_062624740MEKFRRQNRNSIIKIIISNYDTKGIKELKVRYRKQAQLDTFIIKTEIVNNDIFIKSIIEKAREKYRYSFLFDGEEKYHFKNKSSVEIVKKDIFSQTPDNMIRNYKITLKISEKNPRVVEAEIEDLMNSTILKDGRRSARREKSMTERKLIEEKVAKNYSLLANCPMEEVDSIKIYKIKRFLTYRSNMLLYFASINSFLCEGIKGKDNETEEIWHLKDNDVRKEKVRENFKNKLIQSTENYNSSLKNQIEEKEKLLRKEFKKGAFYRTIIKKLQQERIKELSEKSLTEDCEKIIKLYSKLRHSLMHYDYQYFENLFENKKNDDLMKDLNLDLFKSLPLIRKMKLNNKVNYLEDGDTLFVLQKTKKAKTLYQIYDALCEQKNGFNKFINDFFVSDGEENTVFKQIINEKFQSEMEFLEKRISESEKKNEKLKKKLDSMKAHFRNINSEDTKEAYFWDIHSSRNYKTKYNERKNLVNEYTELLGSSKEKKLLREEITKINRQLLKLKQEMEEITKKNSLFRLEYKMKIAFGFLFCEFDGNISKFKDEFDASNQEKIIQYHKNGEKYLTSFLKEEEKEKFNLEKMQKIIQKTEEEDWLLPETKNNLFKFYLLTYLLLPYELKGDFLGFVKKHYYDIKNVDFIDENQNNIQVSQTVEKQEDYFYHKIRLFEKNTKKYEIVKYSIVPNEKLKQYFEDLGIDIKYLTGSVESGEKWLGENLGIDIKYLTVEQKSEVSEEKNKKVSLKNNGMFNKTILLFVFKYYQIAFKLFNDIELYSLFFLREKSGKPLEIFRKELESKMKDGYLNFGOLLYVVYEVLVKNKDLDKILSKKIDYRKDKSFSPEIAYLRNFLSHLNYSKFLDNFMKINTNKSDENKEVLIPSIKIQKMIQFIEKCNLQNQIDFDFNFVNDFYMRKEKMFFIQLKQIFPDINSTEKQKMNEKEEILRNRYHLTDKKNEQIKDEHEAQSQLYEKILSLQKIYSSDKNNFYGRLKEEKLLFLEKQGKKKLSMEEIKDKIAGDISDLLGILKKEITRDIKDKLTEKFRYCEEKLLNLSFYNHQDKKKEESIRVFLIRDKNSDNFKFESILDDGSNKIFISKNGKEITIQCCDKVLETLIIEKNTLKISSNGKIISLIPHYSYSIDVKYWP_096402050MENKNKPNRGSIVRIIISNYDMKGIKELKVRYRKQAQLDTFILQTTLDKSNNSILINDFRVKAREKYRYSFTYDGKEKFSVPSNSIIVTKIDNAAPEKSKEIRKYKITLGIDEKCKTGSMITAAIEDLLEDDRVREGIRNPRRKASKTERKLITESICHNYAQITQCPVEEIDAVKIYKVKRFLSYRSNMLLFFALINDFLCKNLKNEKGEKINEIWEMENKGNNKKIDFDENYNILVAQIKEYFTKEIENYNNRIDNIIDKKELLKYSEEKEESEKNKKLEELNKLESQKLKILTDEEIKADVIKIIKIFSDLRHSLMHYEYKYFENLFENKKNEELAELLNLNLFKNLTLLRQMKIENKTNYLEGDEKFNILGKDVRAKNALGHYDLLVEQKNGFNNFINSFFVQDGTENLEFKKFIDENFIKAQKELEEDIKNCKESVKKLEKKLKENPKKSEDLEKKLEKKQKKLKELKKELEKMKQHYKRLNCAYVWDIHSSTVYKKLYNERKNLIEKYNKQLNGLQDKNAITGINAQLLRIKKEMEEITKSNSLFRLKYKMQIAYAFLEMEYEGNIAKFKNEFDCSKTEKIQEWLEKSEEYLNYCMEKEEDGKNYKFHFKEISEIKDTHNEEWLENTSENNLFKFYILTYLLLPMEFKGDFLGVVKKHYYDIKNVDFTDESEKELSQEQIDKMIGDSFFHKIRLFEKNTKRYEIIKYSILTSDEIKKYFELLELKVPYLEYKGIDEIGIFNKNIILPIFKYYQIIFRLYNDLEIHGLFNVSFDINKILSDLKSYGNENINFREFLYVIKQNNNSSTEEEYQKIWEKLESKYLKEPLLTPEKKE
INKKTEKELKKLDGISFLRNKISHLEYEKIIEGVLKTAVNGENKKTSETNADKVFLNEKIKKIINFIKENELDKIELGFNFINDFFMKKEQFMFGQIKQVKEGNSDSITTERKRKEENNKRLKITYGLNYNNLSKIYEFSNTLREIVNSPLFLKDSTLLKKVDLSKVMLKEKPICSLQYENNTKLEDDIKRILLKDSSDIMGIYKAEVVKKLKEKLVLIFKYDEEKKIYVTVYDTSKAVPENISKEILVKRNNSKEEYFFEDNKKKYTTQYYTLEITKENELKVIPAKKLEGKEFKTEKKEENKLMLNNHYCFNVKIIY

In certain example embodiments, the CRISPR effector protein is a Cas13d protein selected from Table 5.

TABLE 5 RfxCas13d MIEKKKSFAKGMGVKSTLVSGSKVYMTTFAEGSDARLEKIVEGDSIRS(SEQ ID VNEGEAFSAEMADKNAGYKIGNAKFSHPKGYAVVANNPLYTGPVQQ NO: 233)DMLGLKETLEKRYFGESADGNDNICIQVIHNILDIEKILAEYITNAAYAVNNISGLDKDIIGFGKFSTVYTYDEFKDPEHHRAAFNNNDKLINAIKAQYDEFDNFLDNPRLGYFGQAFFSKEGRNYIINYGNECYDILALLSGLRHWVVHNNEEESRISRTWLYNLDKNLDNEYISTLNYLYDRITNELTNSFSKNSAANVNYIAETLGINPAEFAEQYFRFSIMKEQKNLGFNITKLREVMLDRKDMSEIRKNHKVFDSIRTKVYTMMDFVIYRYYIEEDAKVAAANKSLPDNEKSLSEKDIFVINLRGSFNDDQKDALYYDEANRIWRKLENIMHNIKEFRGNKTREYKKKDAPRLPRILPAGRDVSAFSKLMYALTMFLDGKEINDLLTTLINKFDNIQSFLKVMPLIGVNAKFVEEYAFFKDSAKIADELRLIKSFARMGEPIADARRAMYIDAIRILGTNLSYDELKALADTFSLDENGNKLKKGKHGMRNFIINNVISNKRFHYLIRYGDPAHLHEIAKNEAVVKFVLGRIADIQKKQGQNGKNQIDRYYETCIGKDKGKSVSEKVDALTKIITGMNYDQFDKKRSVIEDTGRENAEREKFKKIISLYLTVIYHILKNIVNINARYVIGFHCVERDAQLYKEKGYDINLKKLEEKGFSSVTKLCAGIDETAPDKRKDVEKEMAERAKESIDSLESANPKLYANYIKYSDEKKAEEFTRQINREKAKTALNAYLRNTKWNVIIREDLLRIDNKTCTLFRNKAVHLEVARYVHAYINDIAEVNSYFQLYHYIMQRIIMNERYEKSSGKVSEYFDAVNDEKKYNDRLLKLLCVPFGYCIPRFKNLSIEALFDRNEAAKFDK EKKKVSGNS AdmCas13dMNNKRKTKAKAAGLKSVFFDQKQAVLTTFAKGNNSQIEKKVVNSEV (SEQ IDKDLRQPPAFDLELKEKTFYISGKNNINTSRENPLASASLPLSKRQRIRA NO: 234)ERIKRAREENRPYHNVKRVGEDDLRAKADLEKHYFGKEYSDNLKIQIIYNILDINKIISPYINDIVYSMNNLARNDEYIDGKIDVIGSLSSTTDYSSFMSPNKDLEKEKKFSFHRENYKKFVEASKPYMRYYGKVFIRDVKKSKLSTGKGEKIEVMYRSDEEIFTIFQILSYVRQSIMHNDIGNKSSILAIEKYPARFVGFLSDLLKTKTNDVNRMFIDNNSQTNFWVLFSIFGLQDHTSGADKICRNFYDFVIKADSKNLGFSLKKIRELMLDLPNANMLRDHQFDTVRSKFYTLLDFIIYQHYLEEKSRIDNMVEKLRMTLKEEEKEVLYAAEAKIVWNAIGAKVINKLVPMMNGDALKEIKRKNRDRKLPQSVIATVQVNSDANVFSGLIYFLTLFLDGKEINEMVSNLITKFENIDSLLHVDREIYKSDEKDLDLEIEKLALFFKGVVRPNAKTDTGAGEISKSFSIFQSAERIIEELKFIKNVTRMDNEIFPSEGVFLDAANVLGVRGDDFDFSNEFVGDDLHSDANKKIINKINGTKEDRNLRNFIINNVVKSRRFQYIAREININTHYVKQLANNETLNRFVLNKMGDAKIINRYYESISGNTPNIEVRSQIDYLVKRLRSFSFEDLNDVKQKVRPGTNESIEKEKKKALVGLCLTIQYLVYKNLVNINARYTTAFYCLERDSKLKGFGVDVWRDFESYTALTNHFIKEGYLPVRKAEILRANLKHLDCEDGFKYYRNQVTHLNAIRVAYKYINEIKSVHSYFALYHYIMQRHLYDSLQAKAKDSSGFVIDALKKSFEHKIYSKDLLHVLHSPFGYNTARYKNLSIEALFDKNESRPEVNPLSTND 
UrCas13dMAKKNKMKPRELREAQKKARQLKAAEINNNAAPAIAAMPAAEVIAP (SEQ IDAAEKKKSSVKAAGMKSILVSENKMYITSFGKGNSAVLEYEVDNNDY NO: 235)NQTQLSSKDNSNIQLGGVNEVNITFSSKHGFESGVEINTSNPTHRSGESSPVRGDMLGLKSELEKRFFGKTFDDNIHIQLIYNILDIEKILAVYVTNIVYALNNMLGVKGSESHDDFIGYLSTNNIYDVFIDPDNSSLSDDKKANVRKSLSKFNALLKTKRLGYFGLEEPKTKDNRVSQAYKKRVYHMLAIVGQIRQCVFHDKSGAKRFDLYSFINNIDPEYRDTLDYLVEERLKSINKDFIEDNKVNISLLIDMMKGYEADDIIRLYYDFIVLKSQKNLGFSIKKLREKMLDEYGFRFKDKQYDSVRSKMYKLMDFLLFCNYYRNDIAAGESLVRKLRFSMTDDEKEGIYADEAAKLWGKFRNDFENIADHMNGDVIKELGKADMDFDEKILDSEKKNASDLLYFSKMIYMLTYFLDGKEINDLLTTLISKFDNIKEFLKIMKSSAVDVECELTAGYKLFNDSQRITNELFIVKNIASMRKPAASAKLTMFRDALTILGIDDKITDDRISGILKLKEKGKGIHGLRNFITNNVIESSRFVYLIKYANAQKIREVAKNEKVVMFVLGGIPDTQIERYYKSCVEFPDMNSSLGVKRSELARMIKNISFDDFKNVKQQAKGRENVAKERAKAVIGLYLTVMYLLVKNLVNVNARYVIAIHCLERDFGLYKEIIPELASKNLKNDYRILSQTLCELCDKSPNLFLKKNERLRKCVEVDINNADSSMTRKYRNCIAHLTVVRELKEYIGDICTVDSYFSIYHYVMQRCITKRENDTKQEEKIKYEDDLLKNHGYTKDFVKALNSPFGYNIPRFKNLSIEQL FDRNEYLTEK P1E0Cas13dMEREVKKPPKKSLAKAAGLKSTFVISPQEKELAMTAFGRGNDALLQK (SEQ IDRIVDGVVRDVAGEKQQFQVQRQDESRFRLQNSRLADRTVTADDPLH NO: 236)RAETPRRQPLGAGMDQLRRKAILEQKYFGRTFDDNIHIQLIYNILDIHKMLAVPANHIVHTLNLLGGYGETDFVGMLPAGLPYDKLRVVKKKNGDTVDIKADIAAYAKRPQLAYLGAAFYDVTPGKSKRDAARGRVKREQDVYAILSLMSLLRQFCARDSVRIWGQNTTAALYHLQALPQDMKDLLDDGWRRALGGVNDHFLDTNKVNLLTLFEYYGAETKQARVALTQDFYRFVVLKEQKNMGFSLRRLREELLKLPDAAYLTGQEYDSVRQKLYMLLDFLLCRLYAQERADRCEELVSALRCALSDEEKDTVYQAEAAALWQALGDTLRRKLLPLLKGKKLQDKDKKKSDELGLSRDVLDGVLFRPAQQGSRANADYFCRLMHLSTWFMDGKEINTLLTTLISKLENIDSLRSVLESMGLAYSFVPAYAMFDHSRYIAGQLRVVNNIARMRKPAIGAKREMYRAAVVLLGVDSPEAAAAITDDLLQIDPETGKVRPRSDSARDTGLRNFIANNVVESRRFTYLLRYMTPEQARVLAQNEKLIAFVLSTVPDTQLERYCRTCGREDITGRPAQIRYLTAQIMGVRYESFTDVEQRGRGDNPKKERYKALIGLYLTVLYLAVKNMVNCNARYVIAFYCRDRDTALYQKEVCWYDLEEDKKSGKQRQVEDYTALTRYFVSQGYLNRHACGYLRSNIVINGISNSLLTAYRNAVDHLNAIPPLGSLCRDIGRVDSYFALYHYAVQQYLNGRYYRKTPREQELFAAMAQHRTWCSDLVKALNTPFGYNLARYKNLSIDGLF DREGDHVVREDGEKPAERffCas13d MKKKMSLREKREAEKQAKKAAYSAASKNTDSKPAEKKAETPKPAEII (SEQ IDSDNSRNKTAVKAAGLKSTIISGDKLYMTSFGKGNAAVIEQKIDINDYS NO: 
237)FSAMKDTPSLEVDKAESKEISFSSHHPFVKNDKLTTYNPLYGGKDNPEKPVGRDMLGLKDKLEERYFGCTFNDNLHIQIIYNILDIEKILAVHSANITTALDHMVDEDDEKYLNSDYIGYMNTINTYDVFMDPSKNSSLSPKDRKNIDNSRAKFEKLLSTKRLGYFGFDYDANGKDKKKNEEIKKRLYHLTAFAGQLRQWSFHSAGNYPRTWLYKLDSLDKEYLDTLDHYFDKRFNDINDDFVTKNATNLYILKEVFPEANFKDIADLYYDFIVIKSHKNMGFSIKKLREKMLECDGADRIKEQDMDSVRSKLYKLIDFCIFKYYHEFPELSEKNVDILRAAVSDTKKDNLYSDEAARLWSIFKEKFLGFCDKIVVWVTGEHEKDITSVIDKDAYRNRSNVSYFSKLMYAMCFFLDGKEINDLLTTLINKFDNIANQIKTAKELGINTAFVKNYDFFNHSEKYVDELNIVKNIARMKKPSSNAKKAMYHDALTILGIPEDMDEKALDEELDLILEKKTDPVTGKPLKGKNPLRNFIANNVIENSRFIYLIKFCNPENVRKIVNNTKVTEFVLKRIPDAQIERYYKSCTDSEMNPPTEKKITELAGKLKDMNFGNFRNVRQSAKENMEKERFKAVIGLYLTVVYRVVKNLVDVNSRYIMAFHSLERDSQLYNVSVDNDYLALTDTLVKEGDNSRSRYLAGNKRLRDCVKQDIDNAKKWFVSDKYNSITKYRNNVAHLTAVRNCAEFIGDITKIDSYFALYHYLIQRQLAKGLDHERSGFDRNYPQYAPLFKWHTYVKDVVKALNAPFGYNIPRFKNLSIDALFDRNEIKKNDGEKKSDD RaCas13dMAKKSKGMSLREKRELEKQKRIQKAAVNSVNDTPEKTEEANVVSVN (SEQ IDVRTSAENKHSKKSAAKALGLKSGLVIGDELYLTSFGRGNEAKLEKKIS NO: 238)GDTVEKLGIGAFEVAERDESTLTLESGRIKDKTARPKDPRHITVDTQGKFKEDMLGIRSVLEKKIFGKTFDDNIHVQLAYNILDVEKIMAQYVSDIVYMLHNTDKTERNDNLMGYMSIRNTYKTFCDTSNLPDDTKQKVENQKREFDKIIKSGRLGYFGEAFMVNSGNSTKLRPEKEIYHIFALMASLRQSYFHGYVKDTDYQGTTWAYTLEDKLKGPSHEFRETIDKIFDEGFSKISKDFGKMNKVNLQILEQMIGELYGSIERQNLTCDYYDFIQLKKHKYLGFSIKRLRETMLETTPAECYKAECYNSERQKLYKLIDFLIYDLYYNRKPARIEEIVDKLRESVNDEEKESIYSVEAKYVYESLSKVLDKSLKNSVSGETIKDLQKRYDDETANRIWDISQHSISGNVNCFCKLIYIMTLMLDGKEINDLLTTLVNKFDNIASFIDVMDELGLEHSFTDNYKMFADSKAICLDLQFINSFARMSKIDDEKSKRQLFRDALVILDIGNKDETWINNYLDSDIFKLDKEGNKLKGARHDFRNFIANNVIKSSRFKYLVKYSSADGMIKLKTNEKLIGFVLDKLPETQIDRYYESCGLDNAVVDKKVRIEKLSGLIRDMKFDDFSGVKTSNKAGDNDKQDKAKYQAIISLYLMVLYQIVKNMIYVNSRYVIAFHCLERDFGMYGKDFGKYYQGCRKLTDHFIEEKYMKEGKLGCNKKVGRYLKNNISCCTDGLINTYRNQVDHFAVVRKIGNYAAYIKSIGSWFELYHYVIQRIVFDEYRFALNNTESNYKNSIIKHHTYCKDMVKALNTPFGYDLPRYKNLSIGDLFDRNNYLNKTKESIDANSSIDSQ EsCas13dMGKKIHARDLREQRKTDRTEKFADQNKKREAERAVPKKDAAVSVKS (SEQ IDVSSVSSKKDNVTKSMAKAAGVKSVFAVGNTVYMTSFGRGNDAVLEQ NO: 
239)KIVDTSHEPLNIDDPAYQLNVVTMNGYSVTGHRGETVSAVTDNPLRRFNGRKKDEPEQSVPTDMLCLKPTLEKKFFGKEFDDNIHIQLIYNILDIEKILAVYSTNAIYALNNMSADENIENSDFFMKRTTDETFDDFEKKKESTNSREKADFDAFEKFIGNYRLAYFADAFYVNKKNPKGKAKNVLREDKELYSVLTLIGKLRHWCVHSEEGRAEFWLYKLDELKDDFKNVLDVVYNRPVEEINNRFIENNKVNIQILGSVYKNTDIAELVRSYYEFLITKKYKNMGFSIKKLRESMLEGKGYADKEYDSVRNKLYQMTDFILYTGYINEDSDRADDLVNTLRSSLKEDDKTTVYCKEADYLWKKYRESIREVADALDGDNIKKLSKSNIEIQEDKLRKCFISYADSVSEFTKLIYLLTRFLSGKEINDLVTTLINKFDNIRSFLEIMDELGLDRTFTAEYSFFEGSTKYLAELVELNSFVKSCSFDINAKRTMYRDALDILGIESDKTEEDIEKMIDNILQIDANGDKKLKKNNGLRNFIASNVIDSNRFKYLVRYGNPKKIRETAKCKPAVRFVLNEIPDAQIERYYEACCPKNTALCSANKRREKLADMIAEIKFENFSDAGNYQKANVTSRTSEAEIKRKNQAIIRLYLTVMYIMLKNLVNVNARYVIAFHCVERDTKLYAESGLEVGNIEKNKTNLTMAVMGVKLENGIIKTEFDKSFAENAANRYLRNARWYKLILDNLKKSERAVVNEFRNTVCHLNAIRNININIKEIKEVENYFALYHYLIQKHLENRFADKKVERDTGDFISKLEEHKTYCKDFVKAYCTPFGYNLVRYKNLTIDGLFDKNYPGKDDSD EQK

Cas13 Variants and Mutations

The present disclosure provides for variants and mutated forms of Cas proteins. In some examples, the present disclosure includes variants and mutated forms of Cas13, e.g., Cas13b. The variants or mutated forms of Cas protein may be catalytically inactive, e.g., have no or reduced nuclease activity compared to a corresponding wildtype. In certain examples, the variants or mutated forms of Cas protein have nickase activity.

Mutations of Cas13

In some cases, the present disclosure provides for mutated Cas13 proteins comprising one or more modified amino acids, wherein the amino acids: (a) interact with a guide RNA that forms a complex with the mutated Cas13 protein; (b) are in a HEPN active site, an inter-domain linker domain, or a bridge helix domain of the mutated Cas13 protein; or a combination thereof.

The term “corresponding amino acid” or “residue which corresponds to” refers to a particular amino acid or analogue thereof in a Cas13 homologue or orthologue that is identical or functionally equivalent to an amino acid in a reference Cas protein. Accordingly, as used herein, referral to an “amino acid position corresponding to amino acid position [X]” of a specified Cas13 protein represents referral to a collection of equivalent positions in other recognized Cas13 and structural homologues and families. The mutations described herein apply to all Cas13 proteins that are orthologs or homologs of the referred Cas protein (e.g., PbCas13b). For example, the mutations apply to Cas13a, Cas13b, Cas13c, Cas13d, Cas13b-t1, Cas13b-t2, or Cas13b-t3.
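By way of non-limiting illustration, one way a corresponding amino acid position may be located is by walking a pairwise alignment of the reference protein and an orthologue. The sketch below assumes the two sequences have already been aligned (gaps written as "-") by any suitable alignment tool; the function name `map_position` and the toy sequences are hypothetical and are not part of the disclosure.

```python
def map_position(ref_aligned: str, ortho_aligned: str, ref_pos: int):
    """Map a 1-based reference residue number to the aligned orthologue residue.

    Returns (orthologue_position, orthologue_residue), or None when the
    reference residue is aligned to a gap in the orthologue.
    """
    if len(ref_aligned) != len(ortho_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0    # residues of the reference seen so far
    ortho_count = 0  # residues of the orthologue seen so far
    for ref_char, ortho_char in zip(ref_aligned, ortho_aligned):
        if ref_char != "-":
            ref_count += 1
        if ortho_char != "-":
            ortho_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            if ortho_char == "-":
                return None  # no equivalent residue in the orthologue
            return ortho_count, ortho_char
    raise ValueError("ref_pos is outside the reference sequence")


# Toy alignment (hypothetical sequences, for illustration only).
REF_ALIGNED = "MK-RHLT"
ORTHO_ALIGNED = "MKARH-T"
print(map_position(REF_ALIGNED, ORTHO_ALIGNED, 3))
```

A sequence alignment alone may not capture functional equivalence; as described herein, structural information can be used in addition to alignments.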

In an aspect, the invention relates to a mutated Cas13 protein comprising one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296, N297, Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398, E399, K294, E400, R56, N157, H161, H452, N455, K484, N486, G566, H567, A656, V795, A796, W842, K871, E873, R874, R1068, N1069, or H1073.

PbCas13b as used herein preferably has the sequence of NCBI Reference Sequence WP_004343973.1. It is to be understood that WP_004343973.1 refers to the wild type (i.e. unmutated) PbCas13b. LshCas13a (Leptotrichia shahii Cas13a) as used herein preferably has the sequence of NCBI Reference Sequence WP_018451595.1. It is to be understood that WP_018451595.1 refers to the wild type (i.e. unmutated) LshCas13a. PguCas13b (Porphyromonas gulae Cas13b) as used herein preferably has the sequence of NCBI Reference Sequence WP_039434803.1. It is to be understood that WP_039434803.1 refers to the wild type (i.e. unmutated) PguCas13b. PspCas13b (Prevotella sp. P5-125 Cas13b) as used herein preferably has the sequence of NCBI Reference Sequence WP_044065294.1. It is to be understood that WP_044065294.1 refers to the wild type (i.e. unmutated) PspCas13b.
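Because residue numbering is given relative to a wild type reference sequence, it can be useful to confirm that the reference actually carries the stated wild type residue before a substitution is applied. The following is a minimal sketch under that assumption; the helper name `apply_point_mutation` and the five-residue toy sequence are hypothetical, and the actual reference sequences are not reproduced here.

```python
import re

# Single substitution written as wild-type letter, 1-based position, new letter.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_point_mutation(sequence: str, mutation: str) -> str:
    """Return a copy of `sequence` carrying the substitution, e.g. "Y4A"."""
    match = _MUTATION.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    found = sequence[position - 1]
    if found != wild_type:
        # The label does not agree with the supplied reference sequence.
        raise ValueError(f"expected {wild_type} at position {position}, found {found}")
    return sequence[: position - 1] + replacement + sequence[position:]


# Toy reference (hypothetical, not a Cas13 sequence).
print(apply_point_mutation("MKRYH", "Y4A"))
```

The wild type check is a design choice of this sketch: it rejects a label whose numbering does not match the supplied sequence instead of silently mutating a different residue.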

In embodiments of the invention, a Type VI system comprises a mutated Cas13 effector protein according to the invention as described herein (and optionally a small accessory protein encoded upstream or downstream of a Cas13b effector protein). In certain embodiments, the small accessory protein enhances the Cas13b effector's ability to target RNA.

Insights from the structure of Cas13 enable further rational engineering to improve functionality for RNA targeting specificity, base editing, nucleic acid detection, etc. Based on the elucidated crystal structure of the Cas13 effector with its crRNA described herein, functional implications of rational engineering and mutagenesis can be postulated, of which non-limiting mutations are exemplified in Table 6 below (with reference to PbCas13b; WP_004343973.1).

TABLE 6

Residue | Description | Expected result
T405 | coordinates first base of guide (U) | alter activity
H407 | base stacking with UO | possible PFS involvement
H407Y/W/F | base stacking with UO | alter PFS
K457 | direct readout of A31 |
H500 | hydrogen bond with bb of G11 | alter activity
K570 | direct readout of G25 | alter activity
K590 | bb of U27 | alter activity
N634 | bb of A29 | alter activity
R638 | bb of A28 | alter activity
N652 | direct readout of U2 and C36 | alter activity
N653 | direct readout of C36 | alter activity
K655 | hydrogen bonds with bb of na 3 | alter activity
S658 | coordinates first base of guide | alter activity
K741 | direct readout of U27 | alter activity
K744 | hydrogen bonds with bb of na 6 | alter activity
N756 | direct readout of C33 and C5 | alter activity
S757 | direct readout of A32 | alter activity
R762 | hydrogen bond with bb of G10 | alter activity
R791 | bb of A22 | alter activity
K846 | hydrogen bond with bb of U18 | alter activity
K857 | hydrogen bond with bb of C15 | alter activity
K870 | hydrogen bond with base of U19 | alter activity
R877 | direct readout of U18 | alter activity
Channels
K183 | Outer channel rim | alter activity
K193 | Outer channel rim | alter activity
R600 | Outer channel rim | alter activity
K607 | Outer channel rim | alter activity
K612 | Outer channel rim | alter activity
R614 | Outer channel rim | alter activity
K617 | Outer channel rim | alter activity
K826 | Bridge helix domain | alter activity
K828 | Bridge helix domain | alter activity
K829 | Bridge helix domain | alter activity
R824 | Bridge helix domain | alter activity
R830 | Bridge helix domain | alter activity
Q831 | Bridge helix domain | alter activity
K835 | Bridge helix domain | alter activity
K836 | Bridge helix domain | alter activity
R838 | Bridge helix domain | alter activity
R618 | conserved outer channel arginine | alter activity
D434 | Conserved loop | alter activity
K431 | Conserved loop | alter activity
Active site pocket
46-57 | HEP1 |
73-79 | HEP1 |
152-164 | HEP1 |
1036-1046 | HEP2 |
1064-1074 | HEP2 |
R53A/K/D/E | HEP1 | change in base specificity
K943A/R/D/E | HEP2 | change in base specificity
R1041A/K/D/E | HEP2 | change in base specificity
Y164A/F/W | | affect base stacking at active site
Interdomain linker 285-299
R285 | central channel active pocket | alter activity
R287 | central channel active pocket | alter activity
K292 | central channel active pocket | alter activity
E296 | central channel active pocket | alter activity
N297 | central channel active pocket | alter activity
Other | Trans active site loop | alter activity
Q646 | Trans active site loop | alter activity
N647 | Trans active site loop | alter activity
HEPN interface crRNA processing
R402 | remove crRNA processing | alter crRNA processing
K393 | remove crRNA processing | alter crRNA processing
N653 | remove crRNA processing | alter crRNA processing
N652 | remove crRNA processing | alter crRNA processing
R482 | remove crRNA processing | alter crRNA processing
N480 | remove crRNA processing | alter crRNA processing
LID domain
D396 | hairpin with unknown function | alter crRNA processing
E397 | hairpin with unknown function | alter crRNA processing
D398 | hairpin with unknown function | alter crRNA processing
E399 | hairpin with unknown function | alter crRNA processing
K294 | IDL | alter activity

Structural (Sub)Domains

In another aspect, the disclosure provides a mutated Cas13 protein comprising one or more mutations of amino acids, wherein the amino acids: interact with a guide RNA that forms a complex with the engineered Cas13 protein; or are in a HEPN active site, a lid domain, a helical domain selected from a helical 1 or a helical 2 domain, an inter-domain linker (IDL) domain, or a bridge helix domain of the mutated Cas13 protein, or a combination thereof.

Based on the crystal structure of the Cas protein, different structural domains can be identified. In addition to sequence alignments, the information of the crystal structure and domain architecture allows corresponding amino acids of different orthologues (e.g. Cas13b orthologues) and homologues (other Cas13 proteins, such as Cas13a, Cas13c, or Cas13d) to be identified. By means of example, and without limitation, the crystal structure of PbCas13b in complex with crRNA as reported herein identifies the following structural domains (see also FIG. 1A): HEPN1 and HEPN2 (catalytic domains, respectively spanning from amino acid 1 to 285 and 930 to 1127); IDL (interdomain linker, spanning from amino acids 286 to 301); helical domains 1 and 2, whereby helical domain 1 is split in helical domain 1-1, 1-2, and 1-3 (respectively spanning from amino acids 302 to 374, 499 to 581, and 747 to 929), and helical domain 2 spanning from amino acids 582 to 746; LID (spanning from amino acids 375 to 498). Helical domain 1, in particular helical domain 1-3, encompasses a bridge helix as a discernible subdomain. Accordingly, particular mutations according to the invention as described herein, apart from having a specified amino acid position in the Cas13 polypeptide, can also be linked to a particular structural domain of the Cas13 protein. Hence a corresponding amino acid in a Cas13 orthologue or homologue can have a specified amino acid position in the Cas13 polypeptide as well as belong to a corresponding structural domain (see also for instance FIG. 4 as an example of corresponding amino acids in HEPN1 and HEPN2 of Cas13a and Cas13b). Mutations may be identified by locations in structural (sub)domains, by position corresponding to amino acids of a particular Cas13 protein (e.g. PbCas13b), by interactions with a guide RNA, or a combination thereof.
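For illustration only, the PbCas13b domain boundaries recited above can be encoded as a lookup so that a residue label is assigned to its structural domain. The boundaries are taken from the preceding paragraph; the names `PBCAS13B_DOMAINS`, `residue_number`, and `domain_of` are hypothetical.

```python
# (domain name, first residue, last residue), 1-based and inclusive,
# as recited for PbCas13b in the preceding paragraph.
PBCAS13B_DOMAINS = [
    ("HEPN1", 1, 285),
    ("IDL", 286, 301),
    ("helical domain 1-1", 302, 374),
    ("LID", 375, 498),
    ("helical domain 1-2", 499, 581),
    ("helical domain 2", 582, 746),
    ("helical domain 1-3", 747, 929),
    ("HEPN2", 930, 1127),
]


def residue_number(label: str) -> int:
    """Extract the position from a residue label such as "W842"."""
    return int(label[1:])


def domain_of(position: int) -> str:
    """Return the structural domain containing a 1-based PbCas13b position."""
    for name, first, last in PBCAS13B_DOMAINS:
        if first <= position <= last:
            return name
    raise ValueError(f"position {position} is outside residues 1 to 1127")


for label in ("R53", "K294", "T405", "H500", "N652", "W842", "R1041"):
    print(label, domain_of(residue_number(label)))
```

The lookup applies to PbCas13b numbering only; for an orthologue or homologue, the corresponding position would first have to be identified.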

The types of mutations can be conservative mutations or non-conservative mutations. In certain preferred embodiments, the amino acid which is mutated is mutated into alanine (A). In certain preferred embodiments, if the amino acid to be mutated is an aromatic amino acid, it is mutated into alanine or another aromatic amino acid (e.g. H, Y, W, or F). In certain preferred embodiments, if the amino acid to be mutated is a charged amino acid, it is mutated into alanine or another charged amino acid (e.g. H, K, R, D, or E). In certain preferred embodiments, if the amino acid to be mutated is a charged amino acid, it is mutated into alanine or another charged amino acid having the same charge. In certain preferred embodiments, if the amino acid to be mutated is a charged amino acid, it is mutated into alanine or another charged amino acid having the opposite charge.
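The substitution preferences above may be sketched as a small helper that lists candidate replacement residues for a given wild type residue. The aromatic and charged groupings follow the examples given in the text; the assignment of H, K, and R as positively charged and D and E as negatively charged is an assumption of this sketch, as are the function name `candidate_substitutions` and the `charge_rule` parameter.

```python
AROMATIC = ("H", "Y", "W", "F")  # aromatic examples named in the text
POSITIVE = ("H", "K", "R")       # assumed positive side chains
NEGATIVE = ("D", "E")            # assumed negative side chains


def candidate_substitutions(wild_type: str, charge_rule: str = "any") -> list[str]:
    """List candidate replacements: alanine first, then same-class residues.

    charge_rule selects among charged replacements: "any", "same", or "opposite".
    """
    if charge_rule not in ("any", "same", "opposite"):
        raise ValueError(f"unknown charge_rule: {charge_rule!r}")
    candidates = [] if wild_type == "A" else ["A"]
    if wild_type in AROMATIC:
        candidates += [aa for aa in AROMATIC if aa != wild_type]
    if wild_type in POSITIVE:
        same, opposite = POSITIVE, NEGATIVE
    elif wild_type in NEGATIVE:
        same, opposite = NEGATIVE, POSITIVE
    else:
        return candidates  # uncharged residue: no charged replacements added
    pool = {"any": same + opposite, "same": same, "opposite": opposite}[charge_rule]
    candidates += [aa for aa in pool if aa != wild_type and aa not in candidates]
    return candidates


print(candidate_substitutions("Y"))
print(candidate_substitutions("R", charge_rule="same"))
```

The helper only enumerates candidates according to the stated preferences; it does not predict the functional effect of any substitution.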

The invention also provides for methods and compositions wherein one or more amino acid residues of the effector protein may be modified, e.g., an engineered or non-naturally-occurring effector protein or Cas13. In an embodiment, the modification may comprise mutation of one or more amino acid residues of the effector protein. The one or more mutations may be in one or more catalytically active domains of the effector protein, or a domain interacting with the crRNA (such as the guide sequence or direct repeat sequence). The effector protein may have reduced or abolished nuclease activity or alternatively increased nuclease activity compared with an effector protein lacking said one or more mutations. The effector protein may not direct cleavage of the RNA strand at the target locus of interest. In a preferred embodiment, the one or more mutations may comprise two mutations. In a preferred embodiment the one or more amino acid residues are modified in a Cas13b effector protein, e.g., an engineered or non-naturally-occurring effector protein or Cas13b. In some cases, the CRISPR-Cas protein comprises one or more mutations in the helical domain.

The Cas13 protein herein may comprise one or more mutations. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296, N297, Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398, E399, K294, E400, R56, N157, H161, H452, N455, K484, N486, G566, H567, A656, V795, A796, W842, K871, E873, R874, R1068, N1069, or H1073.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296, N297, Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398, E399, K294, E400, R56, N157, H161, H452, N455, K484, N486, G566, H567, W842, K871, E873, R874, R1068, N1069, or H1073.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296, N297, Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398, E399, K294, or E400.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K393, R402, N482, T405, H407, S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, R56, N157, H161, R1068, N1069, or H1073. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N482, H407, S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, R56, N157, H161, R1068, N1069, or H1073.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: W842, K846, K870, E873, or R877. In some cases, the Cas13 protein comprises in helical domain 1 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: W842, K846, K870, E873, or R877. In some cases, the Cas13 protein comprises in helical domain 1-3 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: W842, K846, K870, E873, or R877. In some cases, the Cas13 protein comprises in the helical bridge domain one or more mutations of an amino acid corresponding to the following amino acids in the helical bridge domain of PbCas13b: W842, K846, K870, E873, or R877. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N480, N482, N652, or N653. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N480, or N482. In some cases, the Cas13 protein comprises in the LID domain one or more mutations of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: K393, R402, N480, or N482. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: N652 or N653. In some cases, the Cas13 protein comprises in helical domain 2 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 2 of PbCas13b: N652 or N653.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: T405, H407, S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: H407, S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, A656, K655, N652, H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, S757, N756, or K741. In some cases, the Cas13 protein comprises in a helical domain one or more mutations of an amino acid corresponding to the following amino acids in a helical domain of PbCas13b: S658, N653, A656, K655, N652, H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, S757, N756, or K741.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, S757, or N756. In some cases, the Cas13 protein comprises in helical domain 1 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, S757, or N756. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, R762, V795, A796, R791, G566, S757, or N756. In some cases, the Cas13 protein comprises in helical domain 1 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: H567, H500, R762, V795, A796, R791, G566, S757, or N756.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: K871, K857, K870, W842, E873, R877, K846, or R874. In some cases, the Cas13 protein comprises in the helical bridge domain one or more mutations of an amino acid corresponding to the following amino acids in the helical bridge domain of PbCas13b: K871, K857, K870, W842, E873, R877, K846, or R874. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, or G566. In some cases, the Cas13 protein comprises in helical domain 1-2 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1-2 of PbCas13b: H567, H500, or G566. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, S757, or N756. In some cases, the Cas13 protein comprises in helical domain 1-3 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, S757, or N756. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: R762, V795, A796, R791, S757, or N756. In some cases, the Cas13 protein comprises in helical domain 1-3 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: R762, V795, A796, R791, S757, or N756. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, A656, K655, N652, K590, R638, or K741.
In some cases, the Cas13 protein comprises in helical domain 2 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 2 of PbCas13b: S658, N653, A656, K655, N652, K590, R638, or K741. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: T405, H407, N486, K484, N480, H452, N455, or K457. In some cases, the Cas13 protein comprises in the LID domain one or more mutations of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: T405, H407, N486, K484, N480, H452, N455, or K457. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, K655, N652, H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, S757, N756, or K741.

In some cases, the Cas13 protein comprises in a helical domain one or more mutations of an amino acid corresponding to the following amino acids in a helical domain of PbCas13b: S658, N653, K655, N652, H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, S757, N756, or K741. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, S757, or N756. In some cases, the Cas13 protein comprises in helical domain 1 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, S757, or N756. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, R762, R791, G566, S757, or N756. In some cases, the Cas13 protein comprises in helical domain 1 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: H567, H500, R762, R791, G566, S757, or N756. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, S757, or N756. In some cases, the Cas13 protein comprises in helical domain 1-3 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, S757, or N756. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: R762, R791, S757, or N756.
In some cases, the Cas13 protein comprises in helical domain 1-3 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: R762, R791, S757, or N756.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, K655, N652, K590, R638, or K741. In some cases, the Cas13 protein comprises in helical domain 2 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 2 of PbCas13b: S658, N653, K655, N652, K590, R638, or K741.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: H407, N486, K484, N480, H452, N455, or K457. In some cases, the Cas13 protein comprises in the LID domain one or more mutations of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: H407, N486, K484, N480, H452, N455, or K457.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: R56, N157, H161, R1068, N1069, or H1073. In some cases, the Cas13 protein comprises in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of PbCas13b: R56, N157, H161, R1068, N1069, or H1073.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: R56, N157, or H161. In some cases, the Cas13 protein comprises in HEPN domain 1 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 1 of PbCas13b: R56, N157, or H161. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: R1068, N1069, or H1073. In some cases, the Cas13 protein comprises in HEPN domain 2 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 2 of PbCas13b: R1068, N1069, or H1073.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N482, T405, H407, N486, K484, N480, H452, N455, or K457. In some cases, the Cas13 protein comprises in the LID domain one or more mutations of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: K393, R402, N482, T405, H407, N486, K484, N480, H452, N455, or K457. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N482, H407, N486, K484, N480, H452, N455, or K457. In some cases, the Cas13 protein comprises in the LID domain one or more mutations of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: K393, R402, N482, H407, N486, K484, N480, H452, N455, or K457.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: T405, H407, S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402, or N482. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: H407, S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402, or N482.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: N486, K484, N480, H452, N455, or K457. In some cases, the Cas13 protein comprises in the LID domain one or more mutations of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: N486, K484, N480, H452, N455, or K457. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N482, N486, K484, N480, H452, N455, or K457. In some cases, the Cas13 protein comprises in the LID domain one or more mutations of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: K393, R402, N482, N486, K484, N480, H452, N455, or K457.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402, or N482. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402, or N482.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53 or Y164.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943 or R1041. In some cases, the Cas13 protein comprises in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041. In some cases, the Cas13 protein comprises in HEPN domain 1 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53 or Y164. In some cases, the Cas13 protein comprises in HEPN domain 2 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943 or R1041.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, R1041, R56, N157, H161, R1068, N1069, or H1073.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, R56, N157, or H161. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073. In some cases, the Cas13 protein comprises in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, R1041, R56, N157, H161, R1068, N1069, or H1073. In some cases, the Cas13 protein comprises in HEPN domain 1 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53, Y164, R56, N157, or H161. In some cases, the Cas13 protein comprises in HEPN domain 2 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, or R1041. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, or K193. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943 or R1041. In some cases, the Cas13 protein comprises in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, or R1041.

In some cases, the Cas13 protein comprises in HEPN domain 1 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, or K193. In some cases, the Cas13 protein comprises in HEPN domain 2 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943 or R1041. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, R1041, R56, N157, H161, R1068, N1069, or H1073. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, R56, N157, or H161. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073. In some cases, the Cas13 protein comprises in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, R1041, R56, N157, H161, R1068, N1069, or H1073. In some cases, the Cas13 protein comprises in HEPN domain 1 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, R56, N157, or H161.

In some cases, the Cas13 protein comprises in HEPN domain 2 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K183 or K193. In some cases, the Cas13 protein comprises in HEPN domain 1 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): K183 or K193.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041. In some cases, the Cas13 protein comprises in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, K943, or R1041; preferably R53A, R53K, R53D, or R53E; K943A, K943R, K943D, or K943E; or R1041A, R1041K, R1041D, or R1041E.

In some cases, the Cas13 protein comprises in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, K943, or R1041; preferably R53A, R53K, R53D, or R53E; K943A, K943R, K943D, or K943E; or R1041A, R1041K, R1041D, or R1041E.
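The substitution labels used above (for example R53A, K943R, or R1041E) follow the usual wild-type residue, position, replacement residue convention. The following is a minimal sketch, in Python, of how such a label may be parsed and applied to a sequence. The names `Substitution`, `parse_substitution`, and `apply_substitution` are hypothetical helpers introduced only for illustration, and the five-residue sequence is a toy string, not PbCas13b.

```python
import re
from typing import NamedTuple

class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the residue in the unmutated protein
    position: int   # 1-based residue number in the reference protein
    mutant: str     # one-letter code of the replacement residue

# One-letter codes of the 20 standard amino acids.
_AA = "ACDEFGHIKLMNPQRSTVWY"
_PATTERN = re.compile(rf"^([{_AA}])(\d+)([{_AA}])$")

def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'R53A' into its three parts."""
    match = _PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"not a point-substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)

def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return a copy of `sequence` carrying the substitution.

    Raises ValueError if the position is out of range or if the residue
    found at that position is not the stated wild-type residue.
    """
    index = sub.position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {sub.position} is outside the sequence")
    if sequence[index] != sub.wild_type:
        raise ValueError(
            f"expected {sub.wild_type} at position {sub.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + sub.mutant + sequence[index + 1:]

# Toy example only: the sequence below is illustrative, not a Cas13 sequence.
toy_sequence = "MKRLY"
mutated = apply_substitution(toy_sequence, parse_substitution("R3A"))
```

Checking the wild-type residue before substituting guards against applying a label numbered for one ortholog to the sequence of another.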

In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid Y164 of Prevotella buccae Cas13b (PbCas13b), preferably Y164A, Y164F, or Y164W. In some cases, the Cas13 protein comprises in HEPN domain 1 a mutation of an amino acid corresponding to amino acid Y164 in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b), preferably Y164A, Y164F, or Y164W. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, D434, K431, R402, K393, R482, N480, D396, E397, D398, or E399.

In some cases, the Cas13 protein comprises in the LID domain one or more mutations of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, D434, K431, R402, K393, R482, N480, D396, E397, D398, or E399. In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid H407 of Prevotella buccae Cas13b (PbCas13b), preferably H407Y, H407W, or H407F. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R402, K393, R482, N480, D396, E397, D398, or E399. In some cases, the Cas13 protein comprises in the LID domain one or more mutations of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): R402, K393, R482, N480, D396, E397, D398, or E399. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K457, D434, or K431. In some cases, the Cas13 protein comprises in the LID domain one or more mutations of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): K457, D434, or K431.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, Q646, N647, N653, or N652. In some cases, the Cas13 protein comprises in a helical domain one or more mutations of an amino acid corresponding to the following amino acids in a helical domain of Prevotella buccae Cas13b (PbCas13b): H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, Q646, N647, N653, or N652. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some cases, the Cas13 protein comprises in helical domain 1 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1 of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, or R791. In some cases, the Cas13 protein comprises in helical domain 1 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1 of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, or R791. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some cases, the Cas13 protein comprises in the helical bridge domain one or more mutations of an amino acid corresponding to the following amino acids in the helical bridge domain of Prevotella buccae Cas13b (PbCas13b): K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500 or K570. In some cases, the Cas13 protein comprises in helical domain 1-2 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1-2 of Prevotella buccae Cas13b (PbCas13b): H500 or K570.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some cases, the Cas13 protein comprises in helical domain 1-3 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, or R791. In some cases, the Cas13 protein comprises in helical domain 1-3 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, or R791. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, or R877. In some cases, the Cas13 protein comprises in helical domain 1-3 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, or R877.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, or R838. In some cases, the Cas13 protein comprises in helical domain 1-3 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, or R838.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, K744, R600, K607, K612, R614, K617, R618, Q646, N647, N653, or N652. In some cases, the Cas13 protein comprises in helical domain 2 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, K744, R600, K607, K612, R614, K617, R618, Q646, N647, N653, or N652. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): Q646 or N647. In some cases, the Cas13 protein comprises in helical domain 2 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): Q646 or N647. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N653 or N652. In some cases, the Cas13 protein comprises in helical domain 2 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): N653 or N652. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, or K744. In some cases, the Cas13 protein comprises in helical domain 2 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, or K744. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R600, K607, K612, R614, K617, or R618. In some cases, the Cas13 protein comprises in helical domain 2 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): R600, K607, K612, R614, K617, or R618. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, E296, N297, or K294. In some cases, the Cas13 protein comprises in the IDL domain one or more mutations of an amino acid corresponding to the following amino acids in the IDL domain of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, E296, N297, or K294. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, K292, E296, or N297. In some cases, the Cas13 protein comprises in the IDL domain one or more mutations of an amino acid corresponding to the following amino acids in the IDL domain of Prevotella buccae Cas13b (PbCas13b): R285, K292, E296, or N297.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R285, R287, K292, E296, N297, Q646, N647, or K294. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R402, K393, N653, N652, R482, N480, D396, E397, D398, or E399. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, K655, R762, or R1041; preferably R53A or R53D; K655A; R762A; or R1041E or R1041D. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N297, E296, K292, or R285; preferably N297A, E296A, K292A, or R285A. In some cases, the Cas13 protein comprises in (e.g., the central channel of) the IDL domain one or more mutations of an amino acid corresponding to the following amino acids in (e.g., the central channel of) the IDL domain of Prevotella buccae Cas13b (PbCas13b): N297, E296, K292, or R285; preferably N297A, E296A, K292A, or R285A. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): Q831, K836, R838, N652, N653, R830, K655 or R762; preferably Q831A, K836A, R838A, N652A, N653A, R830A, K655A, or R762A.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N652, N653, R830, K655 or R762; preferably N652A, N653A, R830A, K655A, or R762A. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K655 or R762; preferably K655A or R762A. In some cases, the Cas13 protein comprises in a helical domain one or more mutations of an amino acid corresponding to the following amino acids in a helical domain of Prevotella buccae Cas13b (PbCas13b): Q831, K836, R838, N652, N653, R830, K655 or R762; preferably Q831A, K836A, R838A, N652A, N653A, R830A, K655A, or R762A. In some cases, the Cas13 protein comprises in a helical domain one or more mutations of an amino acid corresponding to the following amino acids in a helical domain of Prevotella buccae Cas13b (PbCas13b): N652, N653, R830, K655 or R762; preferably N652A, N653A, R830A, K655A, or R762A.

In some cases, the Cas13 protein comprises in helical domain 2 one or more mutations of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): K655 or R762; preferably K655A or R762A. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R614, K607, K193, K183 or R600; preferably R614A, K607A, K193A, K183A or R600A. In some cases, the Cas13 protein comprises in the trans-subunit loop of helical domain 2 one or more mutations of an amino acid corresponding to the following amino acids in the trans-subunit loop of helical domain 2 of Prevotella buccae Cas13b (PbCas13b): Q646 or N647; preferably Q646A or N647A. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53 or R1041; preferably R53A or R53D, or R1041E or R1041D. In some cases, the Cas13 protein comprises in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53 or R1041; preferably R53A or R53D, or R1041E or R1041D. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K457, D397, E398, D399, E400, T405, H407 or D434; preferably D397A, E398A, D399A, E400A, T405A, H407A, H407W, H407Y, H407F or D434A. In some cases, the Cas13 protein comprises in the LID domain one or more mutations of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): K457, D397, E398, D399, E400, T405, H407 or D434; preferably D397A, E398A, D399A, E400A, T405A, H407A, H407W, H407Y, H407F or D434A.

In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid T405 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid H407 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K457 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid H500 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K570 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K590 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N634 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R638 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N652 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N653 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K655 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid S658 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K741 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K744 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N756 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid S757 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R762 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R791 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K846 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K857 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K870 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R877 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K183 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K193 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R600 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K607 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K612 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R614 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K617 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K826 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K828 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K829 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R824 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R830 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid Q831 of Prevotella buccae Cas13b (PbCas13b).

In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K835 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K836 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R838 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R618 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid D434 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K431 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R53 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K943 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R1041 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid Y164 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R285 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R287 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K292 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid E296 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N297 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid Q646 of Prevotella buccae Cas13b (PbCas13b).

In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N647 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R402 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K393 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N653 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N652 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R482 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N480 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid D396 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid E397 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid D398 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid E399 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K294 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid E400 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R56 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N157 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid H161 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid H452 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N455 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K484 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N486 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid G566 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid H567 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid A656 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid V795 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid A796 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid W842 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid K871 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid E873 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R874 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid R1068 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid N1069 of Prevotella buccae Cas13b (PbCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid H1073 of Prevotella buccae Cas13b (PbCas13b).

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, H602, R1278, N1279, or H1283. The present disclosure also includes a mutated Cas13 protein comprising one or more mutations of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, H602, R1278, N1279, or H1283. In some cases, the Cas13 protein comprises in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, H602, R1278, N1279, or H1283. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, or H602. In some cases, the Cas13 protein comprises in HEPN domain 1 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 1 of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, or H602. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R1278, N1279, or H1283. In some cases, the Cas13 protein comprises in HEPN domain 2 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 2 of Leptotrichia shahii Cas13a (LshCas13a): R1278, N1279, or H1283. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R146, H151, R1116, or H1121. In some cases, the Cas13 protein comprises in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Porphyromonas gulae Cas13b (PguCas13b): R146, H151, R1116, or H1121.

In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R146 or H151. In some cases, the Cas13 protein comprises in HEPN domain 1 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 1 of Porphyromonas gulae Cas13b (PguCas13b): R146 or H151. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R1116 or H1121. In some cases, the Cas13 protein comprises in HEPN domain 2 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 2 of Porphyromonas gulae Cas13b (PguCas13b): R1116 or H1121. In some cases, the Cas13 protein comprises one or more mutations of an amino acid corresponding to the following amino acids of Prevotella sp. P5-125 Cas13b (PspCas13b): H133 or H1058. The present disclosure also provides a mutated Cas13 protein comprising one or more mutations of an amino acid corresponding to the following amino acids of Prevotella sp. P5-125 Cas13b (PspCas13b): H133 or H1058. In some cases, the Cas13 protein comprises in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella sp. P5-125 Cas13b (PspCas13b): H133 or H1058.

In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid H133 of Prevotella sp. P5-125 Cas13b (PspCas13b). In some cases, the Cas13 protein comprises in HEPN domain 1 a mutation of an amino acid corresponding to amino acid H133 in HEPN domain 1 of Prevotella sp. P5-125 Cas13b (PspCas13b). In some cases, the Cas13 protein comprises a mutation of an amino acid corresponding to amino acid H1058 of Prevotella sp. P5-125 Cas13b (PspCas13b). In some cases, the Cas13 protein comprises in HEPN domain 2 a mutation of an amino acid corresponding to amino acid H1058 in HEPN domain 2 of Prevotella sp. P5-125 Cas13b (PspCas13b).

The CRISPR-Cas protein herein may comprise one or more mutated amino acids. In some embodiments, the amino acid is mutated to A, P, or V, preferably A. In some embodiments, the amino acid is mutated to a hydrophobic amino acid. In some embodiments, the amino acid is mutated to an aromatic amino acid. In some embodiments, the amino acid is mutated to a charged amino acid. In some embodiments, the amino acid is mutated to a positively charged amino acid. In some embodiments, the amino acid is mutated to a negatively charged amino acid. In some embodiments, the amino acid is mutated to a polar amino acid. In some embodiments, the amino acid is mutated to an aliphatic amino acid.
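As a purely illustrative aid, the substitution options above can be written out in conventional "wild-type residue, position, new residue" notation. The sketch below is not part of the disclosed methods. The function name `mutation_labels` is hypothetical, and the residue groupings (other than A, P, or V, which come from the text) are common textbook assumptions that different references draw differently.

```python
# Illustrative residue groupings (one-letter codes). Textbooks draw these
# boundaries differently, so the sets below are assumptions, not definitions
# taken from the disclosure. Only "A_P_or_V" comes directly from the text.
SUBSTITUTION_SETS = {
    "A_P_or_V": set("APV"),
    "hydrophobic": set("AVILMFW"),
    "aliphatic": set("AVIL"),
    "aromatic": set("FWY"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "polar": set("STNQ"),
}
SUBSTITUTION_SETS["charged"] = (
    SUBSTITUTION_SETS["positive"] | SUBSTITUTION_SETS["negative"]
)


def mutation_labels(wild_type, position, category):
    """Return labels such as 'H133A' for every residue in the chosen set,
    skipping the wild-type residue itself."""
    targets = SUBSTITUTION_SETS[category]
    return sorted(
        f"{wild_type}{position}{aa}" for aa in targets if aa != wild_type
    )


if __name__ == "__main__":
    # PspCas13b H133 with the A, P, or V options named in the text.
    print(mutation_labels("H", 133, "A_P_or_V"))
    # PguCas13b R146 mutated to another positively charged residue.
    print(mutation_labels("R", 146, "positive"))
```

Positions in such labels refer to the reference protein (for example PspCas13b); finding the corresponding residue in another orthologue would require a sequence or structural alignment, which this sketch does not attempt.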

The present disclosure also provides for methods of altering activity of CRISPR-Cas proteins. In some examples, such methods comprise identifying one or more candidate amino acids in the Cas13 protein based on a three-dimensional structure of at least a portion of the Cas13 protein, wherein the one or more candidate amino acids interact with a guide RNA that forms a complex with the Cas13 protein; or are in a HEPN active site, an inter-domain linker domain, or a bridge helix domain of the Cas13 protein; and mutating the one or more candidate amino acids thereby generating a mutated Cas13 protein, wherein activity of the mutated Cas13 protein is different from that of the Cas13 protein.

Destabilized Cas13 and Fusion Proteins

In certain embodiments, the effector protein according to the invention as described herein is associated with or fused to a destabilization domain (DD). In some embodiments, the DD is ER50. A corresponding stabilizing ligand for this DD is, in some embodiments, 4HT. As such, in some embodiments, one of the at least one DDs is ER50 and a stabilizing ligand therefor is 4HT or CMP8. In some embodiments, the DD is DHFR50. A corresponding stabilizing ligand for this DD is, in some embodiments, TMP. As such, in some embodiments, one of the at least one DDs is DHFR50 and a stabilizing ligand therefor is TMP. In some embodiments, the DD is ER50. A corresponding stabilizing ligand for this DD is, in some embodiments, CMP8. CMP8 may therefore be an alternative stabilizing ligand to 4HT in the ER50 system. While it may be possible that CMP8 and 4HT can/should be used in a competitive manner, some cell types may be more susceptible to one or the other of these two ligands, and from this disclosure and the knowledge in the art the skilled person can use CMP8 and/or 4HT.

In some embodiments, one or two DDs may be fused to the N-terminal end of the Cas13 with one or two DDs fused to the C-terminal of the Cas13. In some embodiments, the at least two DDs are associated with the Cas13 and the DDs are the same DD, i.e. the DDs are homologous. Thus, both (or two or more) of the DDs could be ER50 DDs. This is preferred in some embodiments. Alternatively, both (or two or more) of the DDs could be DHFR50 DDs. This is also preferred in some embodiments. In some embodiments, the at least two DDs are associated with the Cas13 and the DDs are different DDs, i.e. the DDs are heterologous. Thus, one of the DDs could be ER50 while one or more of the DDs or any other DDs could be DHFR50. Having two or more DDs which are heterologous may be advantageous as it would provide a greater level of degradation control. A tandem fusion of more than one DD at the N or C-term may enhance degradation; and such a tandem fusion can be, for example, ER50-ER50-Cas13 or DHFR-DHFR-Cas13. It is envisaged that high levels of degradation would occur in the absence of either stabilizing ligand, intermediate levels of degradation would occur in the absence of one stabilizing ligand and the presence of the other (or another) stabilizing ligand, while low levels of degradation would occur in the presence of both (or two or more) of the stabilizing ligands. Control may also be imparted by having an N-terminal ER50 DD and a C-terminal DHFR50 DD.
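The envisaged relationship between ligands and degradation for heterologous DDs can be summarized as a small lookup. The following is a qualitative sketch only; the function name `degradation_level` and the three-level output are illustrative assumptions, and the DD-to-ligand pairings (ER50 with 4HT or CMP8, DHFR50 with TMP) are those stated in this section.

```python
# DD-to-stabilizing-ligand pairings as stated in this section.
STABILIZING_LIGANDS = {
    "ER50": {"4HT", "CMP8"},
    "DHFR50": {"TMP"},
}


def degradation_level(dds, ligands_present):
    """Return a qualitative degradation level ('high', 'intermediate', 'low')
    for a Cas13 fused to the given DDs, given which ligands are present.

    Hypothetical helper: it only encodes the qualitative expectation described
    in the text and makes no quantitative prediction.
    """
    kinds = set(dds)
    if not kinds:
        raise ValueError("at least one DD is required")
    ligands = set(ligands_present)
    # A DD type counts as stabilized if any of its ligands is present.
    stabilized = {dd for dd in kinds if STABILIZING_LIGANDS[dd] & ligands}
    if not stabilized:
        return "high"
    if stabilized == kinds:
        return "low"
    return "intermediate"


if __name__ == "__main__":
    fusion = ["ER50", "DHFR50"]  # e.g. N-terminal ER50, C-terminal DHFR50
    for ligands in ([], ["TMP"], ["4HT", "TMP"]):
        print(ligands, "->", degradation_level(fusion, ligands))
```

With homologous DDs (for example two ER50 domains), the helper returns only "high" or "low", since a single ligand stabilizes every DD in the fusion.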

In some embodiments, the fusion of the Cas13 with the DD comprises a linker between the DD and the Cas13. In some embodiments, the linker is a GlySer linker. In some embodiments, the DD-Cas13 further comprises at least one Nuclear Export Signal (NES). In some embodiments, the DD-Cas13 comprises two or more NESs. In some embodiments, the DD-Cas13 comprises at least one Nuclear Localization Signal (NLS). This may be in addition to an NES. In some embodiments, the Cas13 comprises or consists essentially of or consists of a localization (nuclear import or export) signal as, or as part of, the linker between the Cas13 and the DD. HA or Flag tags are also within the ambit of the invention as linkers. Applicants use NLS and/or NES as linker and also use Glycine Serine linkers as short as GS up to (GGGGS)3.
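For illustration, the Glycine Serine linker range mentioned above (from GS up to (GGGGS)3) can be spelled out as one-letter amino acid strings. The helper names below are hypothetical, and the order of elements in the architecture label (DD first, signal last) is an assumption chosen for the example rather than a required arrangement.

```python
SHORTEST_LINKER = "GS"  # shortest Glycine Serine linker named in the text


def gly_ser_linker(repeats):
    """Return (GGGGS)n as a one-letter amino acid string."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return "GGGGS" * repeats


def describe_fusion(dd, linker_name, signals=()):
    """Return an N-to-C architecture label such as 'ER50-(GGGGS)3-Cas13-NES'.

    The ordering is illustrative only; the text also contemplates localization
    signals or tags serving as the linker itself.
    """
    return "-".join([dd, linker_name, "Cas13", *signals])


if __name__ == "__main__":
    print(SHORTEST_LINKER)
    print(gly_ser_linker(3))
    print(describe_fusion("ER50", "(GGGGS)3", ["NES"]))
```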

Destabilizing domains have general utility to confer instability to a wide range of proteins; see, e.g., Miyazaki, J Am Chem Soc. Mar. 7, 2012; 134(9): 3942-3945, incorporated herein by reference. CMP8 or 4-hydroxytamoxifen can be stabilizing ligands for destabilizing domains. More generally, a temperature-sensitive mutant of mammalian DHFR (DHFRts), a destabilizing residue by the N-end rule, was found to be stable at a permissive temperature but unstable at 37° C. The addition of methotrexate, a high-affinity ligand for mammalian DHFR, to cells expressing DHFRts inhibited degradation of the protein partially. This was an important demonstration that a small molecule ligand can stabilize a protein otherwise targeted for degradation in cells. A rapamycin derivative was used to stabilize an unstable mutant of the FRB domain of mTOR (FRB*) and restore the function of the fused kinase, GSK-3β. This system demonstrated that ligand-dependent stability represented an attractive strategy to regulate the function of a specific protein in a complex biological environment. A system to control protein activity can involve the DD becoming functional when the ubiquitin complementation occurs by rapamycin induced dimerization of FK506-binding protein and FKBP12. Mutants of human FKBP12 or ecDHFR protein can be engineered to be metabolically unstable in the absence of their high-affinity ligands, Shield-1 or trimethoprim (TMP), respectively. These mutants are some of the possible destabilizing domains (DDs) useful in the practice of the invention and instability of a DD as a fusion with a Cas13 confers to the Cas13 degradation of the entire fusion protein by the proteasome. Shield-1 and TMP bind to and stabilize the DD in a dose-dependent manner. The estrogen receptor ligand binding domain (ERLBD, residues 305-549 of ERS1) can also be engineered as a destabilizing domain. Since the estrogen receptor signaling pathway is involved in a variety of diseases such as breast cancer, the pathway has been widely studied and numerous agonists and antagonists of estrogen receptor have been developed. Thus, compatible pairs of ERLBD and drugs are known. There are ligands that bind to mutant but not wild-type forms of the ERLBD. By using one of these mutant domains encoding three mutations (L384M, M421G, G521R), it is possible to regulate the stability of an ERLBD-derived DD using a ligand that does not perturb endogenous estrogen-sensitive networks. An additional mutation (Y537S) can be introduced to further destabilize the ERLBD and to configure it as a potential DD candidate. This tetra-mutant is an advantageous DD development. The mutant ERLBD can be fused to a Cas13 and its stability can be regulated or perturbed using a ligand, whereby the Cas13 has a DD. Another DD can be a 12-kDa (107-amino-acid) tag based on a mutated FKBP protein, stabilized by Shield-1 ligand; see, e.g., Nature Methods 5, (2008). For instance a DD can be a modified FK506 binding protein 12 (FKBP12) that binds to and is reversibly stabilized by a synthetic, biologically inert small molecule, Shield-1; see, e.g., Banaszynski L A, Chen L C, Maynard-Smith L A, Ooi A G, Wandless T J. A rapid, reversible, and tunable method to regulate protein function in living cells using synthetic small molecules. Cell. 2006; 126:995-1004; Banaszynski L A, Sellmyer M A, Contag C H, Wandless T J, Thorne S H. Chemical control of protein stability and function in living mice. Nat Med. 2008; 14:1123-1127; Maynard-Smith L A, Chen L C, Banaszynski L A, Ooi A G, Wandless T J. A directed approach for engineering conditional protein stability using biologically silent small molecules. The Journal of biological chemistry. 2007; 282:24866-24872; and Rodriguez, Chem Biol. Mar. 23, 2012; 19(3): 391-398, all of which are incorporated herein by reference and may be employed in the practice of the invention in selecting a DD to associate with a Cas13. As can be seen, the knowledge in the art includes a number of DDs, and the DD can be associated with, e.g., fused to (advantageously with a linker), a Cas13, whereby the DD can be stabilized in the presence of a ligand and, in the absence thereof, the DD can become destabilized, whereby the Cas13 is entirely destabilized, or the DD can be stabilized in the absence of a ligand and, when the ligand is present, the DD can become destabilized; the DD allows the Cas13 and hence the CRISPR-Cas13 complex or system to be regulated or controlled (turned on or off, so to speak), to thereby provide means for regulation or control of the system, e.g., in an in vivo or in vitro environment. For instance, when a protein of interest is expressed as a fusion with the DD tag, it is destabilized and rapidly degraded in the cell, e.g., by proteasomes. Thus, absence of stabilizing ligand leads to a DD-associated Cas being degraded. When a new DD is fused to a protein of interest, its instability is conferred to the protein of interest, resulting in the rapid degradation of the entire fusion protein. Peak activity for Cas is sometimes beneficial to reduce off-target effects. Thus, short bursts of high activity are preferred. The present invention is able to provide such peaks. In some senses the system is inducible. In some other senses, the system is repressed in the absence of stabilizing ligand and de-repressed in the presence of stabilizing ligand.

Dead Cas Proteins

In certain embodiments, the effector protein herein is a catalytically inactive or dead Cas protein. In some cases, the effector protein (CRISPR enzyme; Cas13; effector protein) according to the invention as described herein is a catalytically inactive or dead Cas13 effector protein (dCas13). In some cases, a dead Cas protein, e.g., a dead Cas13 protein has nickase activity. In some embodiments, the dCas13 effector comprises mutations in the nuclease domain. In some embodiments, the dCas13 effector protein has been truncated. In some cases, the dead Cas proteins may be fused with a deaminase herein, e.g., an adenosine deaminase.

To reduce the size of a fusion protein of the Cas13 effector and the one or more functional domains, the C-terminus of the Cas13 effector can be truncated while still maintaining its RNA binding function. For example, at least 20 amino acids, at least 40 amino acids, at least 50 amino acids, at least 60 amino acids, at least 80 amino acids, at least 100 amino acids, at least 120 amino acids, at least 140 amino acids, at least 150 amino acids, at least 160 amino acids, at least 180 amino acids, at least 200 amino acids, at least 220 amino acids, at least 240 amino acids, at least 250 amino acids, at least 260 amino acids, or at least 300 amino acids, or at least 350 amino acids, or up to 120 amino acids, or up to 140 amino acids, or up to 160 amino acids, or up to 180 amino acids, or up to 200 amino acids, or up to 250 amino acids, or up to 300 amino acids, or up to 350 amino acids, or up to 400 amino acids, may be truncated at the C-terminus of the Cas13 effector. Specific examples of Cas13 truncations include C-terminal Δ984-1090, C-terminal Δ1026-1090, and C-terminal Δ1053-1090, C-terminal Δ934-1090, C-terminal Δ884-1090, C-terminal Δ834-1090, C-terminal Δ784-1090, and C-terminal Δ734-1090, wherein amino acid positions correspond to amino acid positions of Prevotella sp. P5-125 Cas13b protein. The skilled person will understand that similar truncations can be designed for other Cas13b orthologues, or other Cas13 types or subtypes, such as Cas13a, Cas13c, or Cas13d. In some cases, the truncated Cas13b is encoded by nt 1-984 of Prevotella sp. P5-125 Cas13b or the corresponding nt of a Cas13b orthologue or homologue. Examples of Cas13 truncations also include C-terminal Δ795-1095, wherein amino acid positions correspond to amino acid positions of Riemerella anatipestifer Cas13b protein. Examples of Cas13 truncations further include C-terminal Δ875-1175, C-terminal Δ895-1175, C-terminal Δ915-1175, C-terminal Δ935-1175, C-terminal Δ955-1175, C-terminal Δ975-1175, C-terminal Δ995-1175, C-terminal Δ1015-1175, C-terminal Δ1035-1175, C-terminal Δ1055-1175, C-terminal Δ1075-1175, C-terminal Δ1095-1175, C-terminal Δ1115-1175, C-terminal Δ1135-1175, C-terminal Δ1155-1175, wherein amino acid positions correspond to amino acid positions of Porphyromonas gulae Cas13b protein.

In some embodiments, the N-terminus of the Cas13 effector protein may be truncated. For example, at least 20 amino acids, at least 40 amino acids, at least 50 amino acids, at least 60 amino acids, at least 80 amino acids, at least 100 amino acids, at least 120 amino acids, at least 140 amino acids, at least 150 amino acids, at least 160 amino acids, at least 180 amino acids, at least 200 amino acids, at least 220 amino acids, at least 240 amino acids, at least 250 amino acids, at least 260 amino acids, or at least 300 amino acids, or at least 350 amino acids, or up to 120 amino acids, or up to 140 amino acids, or up to 160 amino acids, or up to 180 amino acids, or up to 200 amino acids, or up to 250 amino acids, or up to 300 amino acids, or up to 350 amino acids, or up to 400 amino acids, may be truncated at the N-terminus of the Cas13 effector. Examples of Cas13 truncations include N-terminal Δ1-125, N-terminal Δ1-88, or N-terminal Δ1-72, wherein amino acid positions of the truncations correspond to amino acid positions of Prevotella sp. P5-125 Cas13b protein.
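The deletion notation used for these truncations (a start and end residue, 1-based and inclusive) maps directly onto a string operation. The sketch below is illustrative only: `delete_range` is a hypothetical helper, and the 1090-residue placeholder string stands in for a real protein sequence, which is not reproduced here.

```python
def delete_range(seq, start, end):
    """Remove residues start..end (1-based, inclusive) from a protein
    sequence, as in a deletion of residues 984 to 1090 or 1 to 88."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("deletion range must lie within the sequence")
    return seq[: start - 1] + seq[end:]


if __name__ == "__main__":
    # Placeholder sequence; the length of 1090 is assumed only so that a
    # deletion ending at residue 1090 removes the whole C-terminal tail.
    placeholder = "M" + "A" * 1089

    c_truncated = delete_range(placeholder, 984, 1090)  # C-terminal deletion
    n_truncated = delete_range(placeholder, 1, 88)      # N-terminal deletion
    print(len(c_truncated), len(n_truncated))
```

In practice an N-terminal truncation would usually also need a new start codon in the encoding construct; that detail is outside the scope of this sketch.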

In some embodiments, both the N- and the C-termini of the Cas13 effector protein may be truncated. For example, at least 20 amino acids, at least 40 amino acids, at least 60 amino acids, at least 80 amino acids, at least 100 amino acids, at least 120 amino acids, at least 140 amino acids, at least 160 amino acids, at least 180 amino acids, at least 200 amino acids, at least 220 amino acids, at least 240 amino acids, at least 260 amino acids, at least 280 amino acids, at least 300 amino acids, or at least 350 amino acids may be truncated at the C-terminus of the Cas13 effector, and, in combination with any one of the foregoing C-terminal truncations, at least 20 amino acids, at least 40 amino acids, at least 60 amino acids, at least 80 amino acids, at least 100 amino acids, at least 120 amino acids, at least 140 amino acids, at least 160 amino acids, at least 180 amino acids, at least 200 amino acids, at least 220 amino acids, at least 240 amino acids, at least 260 amino acids, at least 300 amino acids, or at least 350 amino acids may be truncated at the N-terminus of the Cas13 effector. Likewise, at least 20 amino acids, at least 40 amino acids, at least 60 amino acids, at least 80 amino acids, at least 100 amino acids, at least 120 amino acids, at least 140 amino acids, at least 160 amino acids, at least 180 amino acids, at least 200 amino acids, at least 220 amino acids, at least 240 amino acids, at least 260 amino acids, at least 280 amino acids, at least 300 amino acids, or at least 350 amino acids may be truncated at the N-terminus of the Cas13 effector, and, in combination with any one of the foregoing N-terminal truncations, at least 20 amino acids, at least 40 amino acids, at least 60 amino acids, at least 80 amino acids, at least 100 amino acids, at least 120 amino acids, at least 140 amino acids, at least 160 amino acids, at least 180 amino acids, at least 200 amino acids, at least 220 amino acids, at least 240 amino acids, at least 260 amino acids, at least 300 amino acids, or at least 350 amino acids may be truncated at the C-terminus of the Cas13 effector.
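The paired minimum truncation lengths listed above can be enumerated mechanically, which may help when planning a truncation screen. This is only a counting sketch: the variable and function names are hypothetical, and each pair denotes minimum lengths ("at least"), not exact construct boundaries.

```python
from itertools import product

# Minimum truncation lengths named for the terminus stated first in each
# sentence (includes 280) and for the paired terminus (does not include 280).
LEADING = [20, 40, 60, 80, 100, 120, 140, 160, 180, 200,
           220, 240, 260, 280, 300, 350]
PAIRED = [n for n in LEADING if n != 280]


def enumerated_pairs():
    """Return the set of (C-terminal minimum, N-terminal minimum) pairs
    covered by the two groups of examples."""
    c_first = set(product(LEADING, PAIRED))                # C-terminus first
    n_first = {(c, n) for n, c in product(LEADING, PAIRED)}  # N-terminus first
    return c_first | n_first


if __name__ == "__main__":
    pairs = enumerated_pairs()
    print(len(pairs))           # number of distinct pairs
    print((280, 20) in pairs)   # 280 at the C-terminus, 20 at the N-terminus
    print((280, 280) in pairs)  # 280 at both termini is not among the examples
```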

Split Proteins

It is noted that in this context, and more generally for the various applications as described herein, the use of a split version of the RNA targeting effector protein can be envisaged. Indeed, this may not only allow increased specificity but may also be advantageous for delivery. The Cas13 is split in the sense that the two parts of the Cas13 enzyme substantially comprise a functioning Cas13. Ideally, the split should always be so that the catalytic domain(s) are unaffected. That Cas13 may function as a nuclease or it may be a dead-Cas13 which is essentially an RNA-binding protein with very little or no catalytic activity, typically due to mutation(s) in its catalytic domains.

Each half of the split Cas13 may be fused to a dimerization partner. By means of example, and without limitation, employing rapamycin-sensitive dimerization domains allows generation of a chemically inducible split Cas13 for temporal control of Cas13 activity. Cas13 can thus be rendered chemically inducible by being split into two fragments, and rapamycin-sensitive dimerization domains may be used for controlled reassembly of the Cas13. The two parts of the split Cas13 can be thought of as the N′ terminal part and the C′ terminal part of the split Cas13. The fusion is typically at the split point of the Cas13. In other words, the C′ terminal of the N′ terminal part of the split Cas13 is fused to one of the dimer halves, whilst the N′ terminal of the C′ terminal part is fused to the other dimer half.

The Cas13 does not have to be split in the sense that the break is newly created. The split point is typically designed in silico and cloned into the constructs. Together, the two parts of the split Cas13, the N′ terminal and C′ terminal parts, form a full Cas13, comprising preferably at least 70% or more of the wildtype amino acids (or nucleotides encoding them), preferably at least 80% or more, preferably at least 90% or more, preferably at least 95% or more, and most preferably at least 99% or more of the wildtype amino acids (or nucleotides encoding them). Some trimming may be possible, and mutants are envisaged. Non-functional domains may be removed entirely. What is important is that the two parts may be brought together and that the desired Cas13 function is restored or reconstituted. The dimer may be a homodimer or a heterodimer.
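The in silico design step described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names, the `SplitParts` type, and the dimerization-domain strings passed in by the caller are all hypothetical placeholders, and no real Cas13 or dimerization domain sequence is implied.

```python
from dataclasses import dataclass


@dataclass
class SplitParts:
    n_part: str  # N-terminal part of the split protein
    c_part: str  # C-terminal part of the split protein


def split_at(sequence: str, split_point: int) -> SplitParts:
    """Split a protein sequence into N- and C-terminal parts at split_point."""
    if not 0 < split_point < len(sequence):
        raise ValueError("split point must fall strictly inside the sequence")
    return SplitParts(sequence[:split_point], sequence[split_point:])


def fuse_dimerization_domains(parts: SplitParts, dimer_a: str, dimer_b: str):
    """Fuse each part to a dimer half at the split point.

    The C-terminus of the N-terminal part receives one dimer half, and the
    N-terminus of the C-terminal part receives the other.
    """
    return parts.n_part + dimer_a, dimer_b + parts.c_part


def fraction_of_wildtype(parts: SplitParts, wildtype_length: int) -> float:
    """Fraction of wildtype residues retained across the two parts combined."""
    return (len(parts.n_part) + len(parts.c_part)) / wildtype_length
```

A design could then be screened against the thresholds mentioned above, for example by requiring `fraction_of_wildtype(parts, n) >= 0.70` after any trimming.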

In certain embodiments, the Cas13 effector as described herein may be used for mutation-specific, or allele-specific targeting, such as for mutation-specific, or allele-specific knockdown.

The RNA targeting effector protein can moreover be fused to another functional RNase domain, such as a non-specific RNase or Argonaute 2, which acts in synergy to increase the RNase activity or to ensure further degradation of the message.

Modulating Cas13 Effector Proteins

The invention provides accessory proteins that modulate CRISPR protein function. In certain embodiments, the accessory protein modulates catalytic activity of a CRISPR protein. In an embodiment of the invention, an accessory protein modulates targeted, or sequence specific, nuclease activity. In an embodiment of the invention, an accessory protein modulates collateral nuclease activity. In an embodiment of the invention, an accessory protein modulates binding to a target nucleic acid.

According to the invention, the nuclease activity to be modulated can be directed against nucleic acids comprising or consisting of RNA, including without limitation mRNA, miRNA, siRNA and nucleic acids comprising cleavable RNA linkages along with nucleotide analogs. In an embodiment of the invention, the nuclease activity to be modulated can be directed against nucleic acids comprising or consisting of DNA, including without limitation nucleic acids comprising cleavable DNA linkages and nucleic acid analogs.

In an embodiment of the invention, an accessory protein enhances an activity of a CRISPR protein. In certain such embodiments, the accessory protein comprises a HEPN domain and enhances RNA cleavage. In certain embodiments, the accessory protein inhibits an activity of a CRISPR protein. In certain such embodiments, the accessory protein comprises an inactivated HEPN domain or lacks a HEPN domain altogether.

According to the invention, naturally occurring accessory proteins of Type VI CRISPR systems comprise small proteins encoded at or near a CRISPR locus that function to modify an activity of a CRISPR protein. In general, a CRISPR locus can be identified as comprising a putative CRISPR array and/or encoding a putative CRISPR effector protein. In an embodiment, an effector protein can be from 800 to 2000 amino acids, or from 900 to 1800 amino acids, or from 950 to 1300 amino acids. In an embodiment, an accessory protein can be encoded within 25 kb, or within 20 kb or within 15 kb, or within 10 kb of a putative CRISPR effector protein or array, or from 2 kb to 10 kb from a putative CRISPR effector protein or array.

In an embodiment of the invention, an accessory protein is from 50 to 300 amino acids, or from 100 to 300 amino acids or from 150 to 250 amino acids or about 200 amino acids. Non-limiting examples of accessory proteins include the csx27 and csx28 proteins identified herein.
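The size and proximity criteria above lend themselves to a simple filter over annotated open reading frames. The sketch below is illustrative only: the `Orf` record, its coordinate convention, and the function names are assumptions introduced here, and the default thresholds are just the broadest ranges stated above (50 to 300 amino acids, within 25 kb of the putative effector).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orf:
    name: str
    start: int      # genomic start coordinate in bp
    end: int        # genomic end coordinate in bp (end > start)
    length_aa: int  # length of the encoded protein in amino acids


def distance_bp(a: Orf, b: Orf) -> int:
    """Gap in bp between two ORFs on the same contig; 0 if they overlap."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def candidate_accessories(orfs, effector: Orf, max_distance_bp: int = 25_000,
                          min_aa: int = 50, max_aa: int = 300):
    """Return ORFs whose size and distance from the effector match the criteria."""
    return [
        orf for orf in orfs
        if orf != effector
        and min_aa <= orf.length_aa <= max_aa
        and distance_bp(orf, effector) <= max_distance_bp
    ]
```

Narrower embodiments (for example 150 to 250 amino acids, or within 10 kb) can be expressed by passing different keyword arguments.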

Identification and use of a CRISPR accessory protein of the invention is independent of CRISPR effector protein classification. Accessory proteins of the invention can be found in association with or engineered to function with a variety of CRISPR effector proteins. Examples of accessory proteins identified and used herein are representative of CRISPR effector proteins generally. It is understood that CRISPR effector protein classification may involve homology, feature location (e.g., location of REC domains, NUC domains, HEPN sequences), nucleic acid target (e.g. DNA or RNA), absence or presence of tracr RNA, location of guide/spacer sequence 5′ or 3′ of a direct repeat, or other criteria. In embodiments of the invention, accessory protein identification and use transcend such classifications.

In type VI CRISPR-Cas systems that target RNA, the Cas proteins usually comprise two conserved HEPN domains which are involved in RNA cleavage. In certain embodiments, the Cas protein processes crRNA to generate mature crRNA. The guide sequence of the crRNA recognizes target RNA with a complementary sequence and the Cas protein degrades the target strand. More particularly, in certain embodiments, upon target binding, the Cas protein undergoes a structural rearrangement that brings two HEPN domains together to form an active HEPN catalytic site and the target RNA is then cleaved. The location of the catalytic site near the surface of the Cas protein allows non-specific collateral ssRNA cleavage.

In certain embodiments, accessory proteins are instrumental in increasing or reducing target and/or collateral RNA cleavage. Without being bound by theory, an accessory protein that activates CRISPR activity (e.g., a csx28 protein or ortholog or variant comprising a HEPN domain) can be envisioned as capable of interacting with a Cas protein and combining its HEPN domain with a HEPN domain of the Cas protein to form an active HEPN catalytic site, whereas an inhibitory accessory protein (e.g., csx27, which lacks a HEPN domain) can be envisioned as capable of interacting with a Cas protein and reducing or blocking a conformation of the Cas protein that would bring together two HEPN domains.

According to the invention, in certain embodiments, enhancing activity of a Type VI Cas protein or complex thereof comprises contacting the Type VI Cas protein or complex thereof with an accessory protein from the same organism that activates the Cas protein. In other embodiments, enhancing activity of a Type VI Cas protein or complex thereof comprises contacting the Type VI Cas protein or complex thereof with an activator accessory protein from a different organism within the same subclass (e.g., Type VI-b). In other embodiments, enhancing activity of a Type VI Cas protein or complex thereof comprises contacting the Type VI Cas protein or complex thereof with an accessory protein not within the subclass (e.g., a Type VI Cas protein other than Type VI-b with a Type VI-b accessory protein or vice-versa).

According to the invention, in certain embodiments, repressing activity of a Type VI Cas protein or complex thereof comprises contacting the Type VI Cas protein or complex thereof with an accessory protein from the same organism that represses the Cas protein. In other embodiments, repressing activity of a Type VI Cas protein or complex thereof comprises contacting the Type VI Cas protein or complex thereof with a repressor accessory protein from a different organism within the same subclass (e.g., Type VI-b). In other embodiments, repressing activity of a Type VI Cas protein or complex thereof comprises contacting the Type VI Cas protein or complex thereof with a repressor accessory protein not within the subclass (e.g., a Type VI Cas protein other than Type VI-b with a Type VI-b repressor accessory protein or vice-versa).

In certain embodiments where the Type VI Cas protein and the Type VI accessory protein are from the same organism, the two proteins will function together in an engineered CRISPR system. In certain embodiments, it will be desirable to alter the function of the engineered CRISPR system, for example by modifying either or both of the proteins or their expression. In embodiments where the Type VI Cas protein and the Type VI accessory protein are from different organisms which may be within the same class or different classes, the proteins may function together in an engineered CRISPR system but it will often be desired or necessary to modify either or both of the proteins to function together.

Accordingly, in certain embodiments of the invention either or both of a Cas protein and an accessory protein may be modified to adjust aspects of protein-protein interactions between the Cas protein and accessory protein. In certain embodiments, either or both of a Cas protein and an accessory protein may be modified to adjust aspects of protein-nucleic acid interactions. Ways to adjust protein-protein interactions and protein-nucleic acid interaction include without limitation, fitting molecular surfaces, polar interactions, hydrogen bonds, and modulating van der Waals interactions. In certain embodiments, adjusting protein-protein interactions or protein-nucleic acid binding comprises increasing or decreasing binding interactions. In certain embodiments, adjusting protein-protein interactions or protein-nucleic acid binding comprises modifications that favor or disfavor a conformation of the protein or nucleic acid.

By “fitting” is meant determining, including by automatic or semi-automatic means, interactions between one or more atoms of a Cas13 protein (and optionally at least one atom of a Cas13 accessory protein), or between one or more atoms of a Cas13 protein and one or more atoms of a nucleic acid (or optionally between one or more atoms of a Cas13 accessory protein and a nucleic acid), and calculating the extent to which such interactions are stable. Interactions include attraction and repulsion, brought about by charge, steric considerations and the like.

The three-dimensional structure of Type VI CRISPR protein or complex thereof (and/or a Type VI CRISPR accessory protein or complex thereof in the context of Cas13b) provides in the context of the instant invention an additional tool for identifying additional mutations in orthologs of Cas13. The crystal structure can also be a basis for the design of new and specific Cas13s (and optionally Cas13 accessory proteins). Various computer-based methods for fitting are described further. Binding interactions of Cas13s (and optionally accessory proteins), and nucleic acids can be examined through the use of computer modeling using a docking program. Docking programs are known; for example GRAM, DOCK or AUTODOCK (see Walters et al. Drug Discovery Today, vol. 3, no. 4 (1998), 160-178, and Dunbrack et al. Folding and Design 2 (1997), 27-42). This procedure can include computer fitting to ascertain how well the shape and the chemical structure of the binding partners fit one another. Computer-assisted, manual examination of the active site or binding site of a Type VI system may be performed. Programs such as GRID (P. Goodford, J. Med. Chem, 1985, 28, 849-57), a program that determines probable interaction sites between molecules with various functional groups, may also be used to analyze the active site or binding site to predict partial structures of binding compounds. Computer programs can be employed to estimate the attraction, repulsion or steric hindrance of the two binding partners, e.g., components of a Type VI CRISPR system, or a nucleic acid molecule and a component of a Type VI CRISPR system.

Amino acid substitutions may be made on the basis of differences or similarities in amino acid properties (such as polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues) and it is therefore useful to group amino acids together in functional groups. Amino acids may be grouped together based on the properties of their side chains alone. In comparing orthologs, there are likely to be residues conserved for structural or catalytic reasons. These sets may be described in the form of a Venn diagram (Livingstone C. D. and Barton G. J. (1993) “Protein sequence alignments: a strategy for the hierarchical analysis of residue conservation” Comput. Appl. Biosci. 9: 745-756) (Taylor W. R. (1986) “The classification of amino acid conservation” J. Theor. Biol. 119: 205-218). Conservative substitutions may be made, for example according to the table below which describes a generally accepted Venn diagram grouping of amino acids (see Table 7 below).

TABLE 7

Set          Members                    Sub-set             Members
Hydrophobic  F W Y H K M I L V A G C    Aromatic            F W Y H (SEQ ID NO: 240)
                                        Aliphatic           I L V
Polar        W Y H K R E D C S T N Q    Charged             H K R E D (SEQ ID NO: 241)
                                        Positively charged  H K R
                                        Negatively charged  E D
Small        V C A G S P T N D          Tiny                A G S (SEQ ID NO: 242)
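The groupings of Table 7 can be encoded as sets to check whether a proposed substitution stays within a group. This is a minimal sketch, not a definitive rule: the set memberships are transcribed from Table 7, while the function names and the working definition of "conservative" (both residues share at least one of the narrower sub-sets) are assumptions made here for illustration.

```python
# Set memberships transcribed from Table 7 (one-letter amino acid codes).
AMINO_ACID_SETS = {
    "hydrophobic": set("FWYHKMILVAGC"),
    "aromatic": set("FWYH"),
    "aliphatic": set("ILV"),
    "polar": set("WYHKREDCSTNQ"),
    "charged": set("HKRED"),
    "positively charged": set("HKR"),
    "negatively charged": set("ED"),
    "small": set("VCAGSPTND"),
    "tiny": set("AGS"),
}

# Assumption of this sketch: only the narrower sub-sets define "conservative".
NARROW_SUBSETS = ("aromatic", "aliphatic", "positively charged",
                  "negatively charged", "tiny")


def shared_sets(a: str, b: str) -> list:
    """Names of all Table 7 sets that contain both residues, sorted by name."""
    return sorted(name for name, members in AMINO_ACID_SETS.items()
                  if a in members and b in members)


def is_conservative(a: str, b: str) -> bool:
    """True if both residues fall within at least one narrow sub-set."""
    return any(a in AMINO_ACID_SETS[name] and b in AMINO_ACID_SETS[name]
               for name in NARROW_SUBSETS)
```

For example, `shared_sets("K", "R")` lists the sets containing both lysine and arginine, and `is_conservative("K", "E")` is false under this definition because the two residues carry opposite charges.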

In an engineered Cas13 system, modification may comprise modification of one or more amino acid residues of the Cas13 protein (and/or may comprise modification of one or more amino acid residues of the Cas13 accessory protein in the case of Cas13b).

In an engineered Cas13 system, modification may comprise modification of one or more amino acid residues located in a region which comprises residues which are positively charged in the unmodified Cas13 protein (and/or Cas13 accessory protein).

In an engineered Cas13 system, modification may comprise modification of one or more amino acid residues which are positively charged in the unmodified Cas13 protein (and/or Cas13 accessory protein).

In an engineered Cas13 system, modification may comprise modification of one or more amino acid residues which are not positively charged in the unmodified Cas13 protein (and/or Cas13 accessory protein).

The modification may comprise modification of one or more amino acid residues which are uncharged in the unmodified Cas13 protein (and/or Cas13 accessory protein).

The modification may comprise modification of one or more amino acid residues which are negatively charged in the unmodified Cas13 protein (and/or Cas13 accessory protein).

The modification may comprise modification of one or more amino acid residues which are hydrophobic in the unmodified Cas13 protein (and/or Cas13 accessory protein).

The modification may comprise modification of one or more amino acid residues which are polar in the unmodified Cas13 protein (and/or Cas13 accessory protein).

The modification may comprise substitution of a hydrophobic amino acid or polar amino acid with a charged amino acid, which can be a negatively charged or positively charged amino acid. The modification may comprise substitution of a negatively charged amino acid with a positively charged or polar or hydrophobic amino acid. The modification may comprise substitution of a positively charged amino acid with a negatively charged or polar or hydrophobic amino acid.
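The substitution types listed above can be labelled automatically once each residue is assigned a single class. The sketch below is illustrative only. Because the Table 7 sets overlap (for example, K is listed as hydrophobic, polar, and positively charged), it assumes a priority order of charge first, then polar, then hydrophobic; that ordering and the function names are choices made for this example.

```python
POSITIVE = set("HKR")
NEGATIVE = set("ED")
POLAR = set("WYHKREDCSTNQ")
HYDROPHOBIC = set("FWYHKMILVAGC")


def residue_class(residue: str) -> str:
    """Assign one class per residue using the assumed priority order."""
    r = residue.upper()
    if r in POSITIVE:
        return "positively charged"
    if r in NEGATIVE:
        return "negatively charged"
    if r in POLAR:
        return "polar"
    if r in HYDROPHOBIC:
        return "hydrophobic"
    return "other"


def describe_substitution(original: str, replacement: str) -> str:
    """Describe a substitution as '<class of original> -> <class of replacement>'."""
    return f"{residue_class(original)} -> {residue_class(replacement)}"


def is_charge_altering(original: str, replacement: str) -> bool:
    """True if the substitution changes the charge class of the residue."""
    charged = {"positively charged", "negatively charged"}
    before, after = residue_class(original), residue_class(replacement)
    return before != after and (before in charged or after in charged)
```

Under these assumptions, replacing leucine with lysine is reported as a hydrophobic to positively charged substitution.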

Embodiments of the invention include sequences (both polynucleotide or polypeptide) which may comprise homologous substitution (substitution and replacement are both used herein to mean the interchange of an existing amino acid residue or nucleotide, with an alternative residue or nucleotide) that may occur i.e., like-for-like substitution in the case of amino acids such as basic for basic, acidic for acidic, polar for polar, etc. Non-homologous substitution may also occur i.e., from one class of residue to another or alternatively involving the inclusion of unnatural amino acids such as ornithine (hereinafter referred to as Z), diaminobutyric acid ornithine (hereinafter referred to as B), norleucine ornithine (hereinafter referred to as O), pyridylalanine, thienylalanine, naphthylalanine and phenylglycine. Variant amino acid sequences may include suitable spacer groups that may be inserted between any two amino acid residues of the sequence including alkyl groups such as methyl, ethyl or propyl groups in addition to amino acid spacers such as glycine or β-alanine residues. A further form of variation, which involves the presence of one or more amino acid residues in peptoid form, may be well understood by those skilled in the art. For the avoidance of doubt, “the peptoid form” is used to refer to variant amino acid residues wherein the α-carbon substituent group is on the residue's nitrogen atom rather than the α-carbon. Processes for preparing peptides in the peptoid form are known in the art, for example Simon R J et al., PNAS (1992) 89(20), 9367-9371 and Horwell D C, Trends Biotechnol. (1995) 13(4), 132-134.

Homology modelling: Corresponding residues in other Cas13 orthologs can be identified by the methods of Zhang et al., 2012 (Nature; 490(7421): 556-60) and Chen et al., 2015 (PLoS Comput Biol; 11(5): e1004248), a computational protein-protein interaction (PPI) method to predict interactions mediated by domain-motif interfaces. PrePPI (Predicting PPI), a structure based PPI prediction method, combines structural evidence with non-structural evidence using a Bayesian statistical framework. The method involves taking a pair of query proteins and using structural alignment to identify structural representatives that correspond to either their experimentally determined structures or homology models. Structural alignment is further used to identify both close and remote structural neighbors by considering global and local geometric relationships. Whenever two neighbors of the structural representatives form a complex reported in the Protein Data Bank, this defines a template for modelling the interaction between the two query proteins. Models of a complex are created by superimposing the representative structures on their corresponding structural neighbor in the template. This approach is described in Dey et al., 2013 (Prot Sci; 22: 359-66).

Collateral Activity

Collateral activity was recently leveraged for a highly sensitive and specific nucleic acid detection platform termed SHERLOCK that is useful for many clinical diagnoses (Gootenberg, J. S. et al. Nucleic acid detection with CRISPR-Cas13a/C2c2. Science 356, 438-442 (2017)).

According to the invention, engineered CRISPR-Cas systems are optimized for RNA endonuclease activity and can be expressed in mammalian cells and targeted to effectively knock down reporter molecules or transcripts in cells.

The collateral effect of engineered CRISPR-Cas with isothermal amplification provides a CRISPR-based diagnostic providing rapid DNA or RNA detection with high sensitivity and single-base mismatch specificity. The CRISPR-Cas-based molecular detection platform is used to detect specific strains of virus, distinguish pathogenic bacteria, genotype human DNA, and identify cell-free tumor DNA mutations. Furthermore, reaction reagents can be lyophilized for cold-chain independence and long-term storage, and readily reconstituted on paper for field applications.

The ability to rapidly detect nucleic acids with high sensitivity and single-base specificity on a portable platform may aid in disease diagnosis and monitoring, epidemiology, and general laboratory tasks. Although methods exist for detecting nucleic acids, they have trade-offs among sensitivity, specificity, simplicity, cost, and speed.

Microbial Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated (CRISPR-Cas) adaptive immune systems contain programmable endonucleases that can be leveraged for CRISPR-based diagnostics (CRISPR-Dx). CRISPR-Cas can be reprogrammed with CRISPR RNAs (crRNAs) to provide a platform for specific DNA sensing. Upon recognition of its DNA target, activated CRISPR-Cas engages in “collateral” cleavage of nearby non-targeted nucleic acids (i.e., RNA and/or ssDNA). This crRNA-programmed collateral cleavage activity allows CRISPR-Cas to detect the presence of a specific DNA in vivo by triggering programmed cell death or by nonspecific degradation of labelled RNA or ssDNA. Here is described an in vitro nucleic acid detection platform with high sensitivity based on nucleic acid amplification and CRISPR-Cas-mediated collateral cleavage of a commercial reporter RNA, allowing for real-time detection of the target.

Conservation of non-specific ssDNA and RNA directed proteins will inevitably lead to further and, potentially, improved CRISPR proteins that demonstrate collateral cleavage and may be used for detection and offer greater breadth for multiplexed detection of nucleic acid targets in amplified and highly sensitive, especially SHERLOCK, diagnostic systems.

RNA-Based Masking

In certain example embodiments, an RNA-based masking construct suppresses generation of a detectable positive signal, or the RNA-based masking construct suppresses generation of a detectable positive signal by masking the detectable positive signal, or generating a detectable negative signal instead, or the RNA-based masking construct comprises a silencing RNA that suppresses generation of a gene product encoded by a reporting construct, wherein the gene product generates the detectable positive signal when expressed.

In another example embodiment, the RNA-based masking construct is a ribozyme that generates a negative detectable signal, and wherein the positive detectable signal is generated when the ribozyme is deactivated. In one example embodiment, the ribozyme converts a substrate to a first color and wherein the substrate converts to a second color when the ribozyme is deactivated. In another example embodiment, the RNA-based masking agent is an aptamer that sequesters an enzyme, wherein the enzyme generates a detectable signal upon release from the aptamer by acting upon a substrate, or the aptamer sequesters a pair of agents that when released from the aptamers combine to generate a detectable signal.

In another example embodiment, the RNA-based masking construct comprises an RNA oligonucleotide to which are attached a detectable ligand oligonucleotide and a masking component. In certain example embodiments, the detectable ligand is a fluorophore and the masking component is a quencher molecule.

In another aspect, the invention provides a method for detecting target nucleic acids (e.g., RNAs) in samples, comprising: distributing a sample or set of samples into one or more individual discrete volumes, the individual discrete volumes comprising a CRISPR system comprising an effector protein, one or more guide RNAs, and an RNA-based masking construct; incubating the sample or set of samples under conditions sufficient to allow binding of the one or more guide RNAs to one or more target molecules; activating the CRISPR effector protein via binding of the one or more guide RNAs to the one or more target molecules, wherein activating the CRISPR effector protein results in modification of the RNA-based masking construct such that a detectable positive signal is produced; and detecting the detectable positive signal, wherein detection of the detectable positive signal indicates a presence of one or more target molecules in the sample.

In some embodiments, the method for detecting a target nucleic acid in a sample comprises: contacting a sample with: an engineered CRISPR-Cas protein; at least one guide polynucleotide comprising a guide sequence capable of binding to the target nucleic acid and designed to form a complex with the engineered CRISPR-Cas; and an RNA-based masking construct comprising a non-target sequence; wherein the engineered CRISPR-Cas protein exhibits collateral RNase activity and cleaves the non-target sequence of the detection construct; and detecting a signal from cleavage of the non-target sequence, thereby detecting the target nucleic acid in the sample. In some embodiments, the method further comprises contacting the sample with reagents for amplifying the target nucleic acid. In some embodiments, the reagents for amplifying comprise isothermal amplification reaction reagents. In some embodiments, the isothermal amplification reagents comprise nucleic-acid sequence-based amplification, recombinase polymerase amplification, loop-mediated isothermal amplification, strand displacement amplification, helicase-dependent amplification, or nicking enzyme amplification reagents.

In some embodiments, the target nucleic acid is a DNA molecule and the method further comprises contacting the target DNA molecule with a primer comprising an RNA polymerase site and RNA polymerase.

In some embodiments, the masking construct: suppresses generation of a detectable positive signal until the masking construct is cleaved or deactivated, or masks a detectable positive signal or generates a detectable negative signal until the masking construct is cleaved or deactivated.

In some embodiments, the masking construct comprises: a. a silencing RNA that suppresses generation of a gene product encoded by a reporting construct, wherein the gene product generates the detectable positive signal when expressed; b. a ribozyme that generates the negative detectable signal, and wherein the positive detectable signal is generated when the ribozyme is deactivated; c. a ribozyme that converts a substrate to a first color and wherein the substrate converts to a second color when the ribozyme is deactivated; d. an aptamer and/or comprises a polynucleotide-tethered inhibitor; e. a polynucleotide to which a detectable ligand and a masking component are attached; f. a nanoparticle held in aggregate by bridge molecules, wherein at least a portion of the bridge molecules comprises a polynucleotide, and wherein the solution undergoes a color shift when the nanoparticle is dispersed in solution; g. a quantum dot or fluorophore linked to one or more quencher molecules by a linking molecule, wherein at least a portion of the linking molecule comprises a polynucleotide; h. a polynucleotide in complex with an intercalating agent, wherein the intercalating agent changes absorbance upon cleavage of the polynucleotide; or i. two fluorophores tethered by a polynucleotide that undergo a shift in fluorescence when released from the polynucleotide. In some embodiments, the aptamer a. comprises a polynucleotide-tethered inhibitor that sequesters an enzyme, wherein the enzyme generates a detectable signal upon release from the aptamer or polynucleotide-tethered inhibitor by acting upon a substrate; or b. is an inhibitory aptamer that inhibits an enzyme and prevents the enzyme from catalyzing generation of a detectable signal from a substrate or wherein the polynucleotide-tethered inhibitor inhibits an enzyme and prevents the enzyme from catalyzing generation of a detectable signal from a substrate; or c. sequesters a pair of agents that when released from the aptamers combine to generate a detectable signal. In some embodiments, the nanoparticle is a colloidal metal. In some embodiments, the at least one guide polynucleotide comprises a mismatch. In some embodiments, the mismatch is up- or downstream of a single nucleotide variation on the one or more guide sequences.
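Placing a mismatch up- or downstream of a single nucleotide variation position in a guide can be illustrated in code. The following is a hedged sketch only: the function name, the zero-based indexing, and the rule for choosing the replacement base (swapping each base for its Watson-Crick partner) are assumptions made for illustration, and the sketch does not model wobble pairing or predict guide performance.

```python
# Assumed replacement rule for this sketch: swap a base for its
# Watson-Crick partner so the new base always differs from the old one.
SWAP = {"A": "U", "U": "A", "G": "C", "C": "G"}


def introduce_mismatch(guide: str, snv_index: int, offset: int) -> str:
    """Return a copy of an RNA guide with one base changed near the SNV position.

    snv_index is the zero-based position in the guide that pairs with the
    single nucleotide variation; offset is the signed distance from it
    (negative for upstream, positive for downstream, never zero).
    """
    if offset == 0:
        raise ValueError("offset must be non-zero so the SNV position is untouched")
    position = snv_index + offset
    if not 0 <= position < len(guide):
        raise ValueError("mismatch position falls outside the guide")
    base = guide[position]
    if base not in SWAP:
        raise ValueError(f"unexpected base {base!r} in guide")
    return guide[:position] + SWAP[base] + guide[position + 1:]
```

A set of candidate guides can be produced by calling the function over a range of offsets and then evaluating each candidate experimentally.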

In another aspect, the invention provides a method for detecting peptides in samples, comprising: distributing a sample or set of samples into a set of individual discrete volumes, the individual discrete volumes comprising peptide detection aptamers, a CRISPR system comprising an effector protein, one or more guide RNAs, and an RNA-based masking construct, wherein the peptide detection aptamers comprise a masked RNA polymerase site and are configured to bind one or more target molecules; incubating the sample or set of samples under conditions sufficient to allow binding of the peptide detection aptamers to the one or more target molecules, wherein binding of the aptamer to a corresponding target molecule exposes the RNA polymerase binding site resulting in RNA synthesis of a trigger RNA; activating the CRISPR effector protein via binding of the one or more guide RNAs to the trigger RNA, wherein activating the CRISPR effector protein results in modification of the RNA-based masking construct such that a detectable positive signal is produced; and detecting the detectable positive signal, wherein detection of the detectable positive signal indicates a presence of one or more target molecules in a sample.

In certain example embodiments, the one or more guide RNAs are designed to bind to one or more target molecules that are diagnostic for a disease state. In certain other example embodiments, the disease state is an infection, an organ disease, a blood disease, an immune system disease, a cancer, a brain and nervous system disease, an endocrine disease, a pregnancy or childbirth-related disease, an inherited disease, or an environmentally-acquired disease, cancer, or a fungal infection, a bacterial infection, a parasite infection, or a viral infection.

In certain example embodiments, the RNA-based masking construct suppresses generation of a detectable positive signal, or the RNA-based masking construct suppresses generation of a detectable positive signal by masking the detectable positive signal, or generating a detectable negative signal instead, or the RNA-based masking construct comprises a silencing RNA that suppresses generation of a gene product encoded by a reporting construct, wherein the gene product generates the detectable positive signal when expressed, or the RNA-based masking construct is a ribozyme that generates the negative detectable signal, and wherein the positive detectable signal is generated when the ribozyme is inactivated. In other example embodiments, the ribozyme converts a substrate to a first state and wherein the substrate converts to a second state when the ribozyme is inactivated, or the RNA-based masking agent is an aptamer, or the aptamer sequesters an enzyme, wherein the enzyme generates a detectable signal upon release from the aptamer by acting upon a substrate, or the aptamer sequesters a pair of agents that when released from the aptamers combine to generate a detectable signal. In still further embodiments, the RNA-based masking construct comprises an RNA oligonucleotide with a detectable ligand on a first end of the RNA oligonucleotide and a masking component on a second end of the RNA oligonucleotide, or the detectable ligand is a fluorophore and the masking component is a quencher molecule.

Base Editing

The present disclosure also provides for a base editing system. In general, such a system may comprise a deaminase (e.g., an adenosine deaminase or cytidine deaminase) fused with a Cas protein. The Cas protein may be a dead Cas protein or a Cas nickase protein. In certain examples, the system comprises a mutated form of an adenosine deaminase fused with a dead CRISPR-Cas or CRISPR-Cas nickase. The mutated form of the adenosine deaminase may have both adenosine deaminase and cytidine deaminase activities.

In certain example embodiments, a dCas13b can be fused with an adenosine deaminase or cytidine deaminase for base editing purposes. In some cases, the dCas13b is dCas13b-t1, dCas13b-t2, or dCas13b-t3.

In one aspect, the present disclosure provides an engineered adenosine deaminase. The engineered adenosine deaminase may comprise one or more mutations herein. In some embodiments, the engineered adenosine deaminase has cytidine deaminase activity. In certain examples, the engineered adenosine deaminase has both cytidine deaminase activity and adenosine deaminase activity. FIG. 101 shows an example system and method of programmable cytidine to uridine conversion according to some embodiments herein. In some cases, the modifications by base editors herein may be used for targeting post-translational signaling or catalysis. FIG. 102 shows example approaches.

Adenosine Deaminase

The term “adenosine deaminase” or “adenosine deaminase protein” as used herein refers to a protein, a polypeptide, or one or more functional domain(s) of a protein or a polypeptide that is capable of catalyzing a hydrolytic deamination reaction that converts an adenine (or an adenine moiety of a molecule) to a hypoxanthine (or a hypoxanthine moiety of a molecule), as shown below. In some embodiments, the adenine-containing molecule is an adenosine (A), and the hypoxanthine-containing molecule is an inosine (I). The adenine-containing molecule can be deoxyribonucleic acid (DNA) or ribonucleic acid (RNA).
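The effect of adenosine-to-inosine conversion on a sequence can be illustrated with a short string-level sketch. It is a simplification introduced here, not a model of enzyme activity: the function names are hypothetical, the target index is zero-based, and the second function relies on the fact that inosine preferentially pairs with cytidine and is therefore generally read as guanosine by cellular machinery and in sequencing.

```python
def deaminate_adenosine(rna: str, index: int) -> str:
    """Return the RNA with the adenosine at index converted to inosine (I)."""
    if not 0 <= index < len(rna):
        raise ValueError("index falls outside the sequence")
    if rna[index] != "A":
        raise ValueError("target position is not an adenosine")
    return rna[:index] + "I" + rna[index + 1:]


def read_as_canonical(rna: str) -> str:
    """Represent inosine as guanosine, as it is generally read out."""
    return rna.replace("I", "G")
```

For instance, editing the adenosine at index 3 of `AUGACU` gives `AUGICU`, which is read out as `AUGGCU`.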

According to the present disclosure, adenosine deaminases that can be used in connection with the present disclosure include, but are not limited to, members of the enzyme family known as adenosine deaminases that act on RNA (ADARs), members of the enzyme family known as adenosine deaminases that act on tRNA (ADATs), and other adenosine deaminase domain-containing (ADAD) family members. According to the present disclosure, the adenosine deaminase is capable of targeting adenine in RNA/DNA and RNA duplexes. Indeed, Zheng et al. (Nucleic Acids Res. 2017, 45(6): 3369-3377) demonstrate that ADARs can carry out adenosine to inosine editing reactions on RNA/DNA and RNA/RNA duplexes. In particular embodiments, the adenosine deaminase has been modified to increase its ability to edit DNA in an RNA/DNA heteroduplex or in an RNA duplex as detailed herein below.

In some embodiments, the adenosine deaminase is derived from one or more metazoan species, including but not limited to, mammals, birds, frogs, squids, fish, flies and worms. In some embodiments, the adenosine deaminase is a human, squid or Drosophila adenosine deaminase.

In some embodiments, the adenosine deaminase is a human ADAR, including hADAR1, hADAR2, hADAR3. In some embodiments, the adenosine deaminase is a Caenorhabditis elegans ADAR protein, including ADR-1 and ADR-2. In some embodiments, the adenosine deaminase is a Drosophila ADAR protein, including dAdar. In some embodiments, the adenosine deaminase is a squid Loligo pealeii ADAR protein, including sqADAR2a and sqADAR2b. In some embodiments, the adenosine deaminase is a human ADAT protein. In some embodiments, the adenosine deaminase is a Drosophila ADAT protein. In some embodiments, the adenosine deaminase is a human ADAD protein, including TENR (hADAD1) and TENRL (hADAD2).

In some embodiments, the adenosine deaminase is a TadA protein such as E. coli TadA. See Kim et al., Biochemistry 45:6407-6416 (2006); Wolf et al., EMBO J. 21:3841-3851 (2002). In some embodiments, the adenosine deaminase is mouse ADA. See Grunebaum et al., Curr. Opin. Allergy Clin. Immunol. 13:630-638 (2013). In some embodiments, the adenosine deaminase is human ADAT2. See Fukui et al., J. Nucleic Acids 2010:260512 (2010). In some embodiments, the deaminase (e.g., adenosine or cytidine deaminase) is one or more of those described in Cox et al., Science. 2017 Nov. 24; 358(6366): 1019-1027; Komor et al., Nature. 2016 May 19; 533(7603):420-4; and Gaudelli et al., Nature. 2017 Nov. 23; 551(7681):464-471.

In some embodiments, the adenosine deaminase protein recognizes and converts one or more target adenosine residue(s) in a double-stranded nucleic acid substrate into inosine residue(s). In some embodiments, the double-stranded nucleic acid substrate is an RNA-DNA hybrid duplex. In some embodiments, the adenosine deaminase protein recognizes a binding window on the double-stranded substrate. In some embodiments, the binding window contains at least one target adenosine residue. In some embodiments, the binding window is in the range of about 3 bp to about 100 bp. In some embodiments, the binding window is in the range of about 5 bp to about 50 bp. In some embodiments, the binding window is in the range of about 10 bp to about 30 bp. In some embodiments, the binding window is about 1 bp, 2 bp, 3 bp, 5 bp, 7 bp, 10 bp, 15 bp, 20 bp, 25 bp, 30 bp, 40 bp, 45 bp, 50 bp, 55 bp, 60 bp, 65 bp, 70 bp, 75 bp, 80 bp, 85 bp, 90 bp, 95 bp, or 100 bp.

In some embodiments, the adenosine deaminase protein comprises one or more deaminase domains. Not intended to be bound by a particular theory, it is contemplated that the deaminase domain functions to recognize and convert one or more target adenosine (A) residue(s) contained in a double-stranded nucleic acid substrate into inosine (I) residue(s). In some embodiments, the deaminase domain comprises an active center. In some embodiments, the active center comprises a zinc ion. In some embodiments, during the A-to-I editing process, base pairing at the target adenosine residue is disrupted, and the target adenosine residue is “flipped” out of the double helix to become accessible by the adenosine deaminase. In some embodiments, amino acid residues in or near the active center interact with one or more nucleotide(s) 5′ to a target adenosine residue. In some embodiments, amino acid residues in or near the active center interact with one or more nucleotide(s) 3′ to a target adenosine residue. In some embodiments, amino acid residues in or near the active center further interact with the nucleotide complementary to the target adenosine residue on the opposite strand. In some embodiments, the amino acid residues form hydrogen bonds with the 2′ hydroxyl group of the nucleotides.

In some embodiments, the adenosine deaminase comprises human ADAR2 full protein (hADAR2) or the deaminase domain thereof (hADAR2-D). In some embodiments, the adenosine deaminase is an ADAR family member that is homologous to hADAR2 or hADAR2-D.

Particularly, in some embodiments, the homologous ADAR protein is human ADAR1 (hADAR1) or the deaminase domain thereof (hADAR1-D). In some embodiments, glycine 1007 of hADAR1-D corresponds to glycine 487 of hADAR2-D, and glutamic acid 1008 of hADAR1-D corresponds to glutamic acid 488 of hADAR2-D.
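The residue correspondence stated above can be written as a small lookup. The following is a minimal sketch only: the table holds just the two position pairs given in the text (1007 to 487 and 1008 to 488), the function name `to_hadar2_label` is hypothetical, and no general alignment between hADAR1-D and hADAR2-D is assumed for other positions.

```python
import re

# hADAR1-D position -> hADAR2-D position.
# Only the two correspondences stated in the text are included; any other
# position would require a sequence alignment, which is not modeled here.
HADAR1_TO_HADAR2 = {1007: 487, 1008: 488}


def to_hadar2_label(hadar1_label):
    """Translate a substitution label such as 'E1008Q' (hADAR1-D numbering)
    into the corresponding hADAR2-D label ('E488Q')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", hadar1_label)
    if match is None:
        raise ValueError(f"not a substitution label: {hadar1_label!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if position not in HADAR1_TO_HADAR2:
        raise KeyError(f"no stated hADAR2-D counterpart for position {position}")
    return f"{wild_type}{HADAR1_TO_HADAR2[position]}{replacement}"
```

For instance, under these assumptions a mutation of glutamic acid 1008 to glutamine in hADAR1-D maps onto the E488Q label used for hADAR2-D.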

In some embodiments, the adenosine deaminase comprises the wild-type amino acid sequence of hADAR2-D. In some embodiments, the adenosine deaminase comprises one or more mutations in the hADAR2-D sequence, such that the editing efficiency and/or substrate editing preference of hADAR2-D is changed according to specific needs. The engineered adenosine deaminase may be fused with a Cas protein, e.g., Cas9, Cas12 (e.g., Cas12a, Cas12b, Cas12c, Cas12d, etc.), Cas13 (e.g., Cas13a, Cas13b (such as Cas13b-t1, Cas13b-t2, Cas13b-t3), Cas13c, Cas13d, etc.), Cas14, CasX, CasY, or an engineered form of the Cas protein (e.g., an inactive or dead form, or a nickase form). Some examples provided herein include an engineered adenosine deaminase fused with a dead Cas13b protein or a Cas13 nickase.

Certain mutations of hADAR1 and hADAR2 proteins have been described in Kuttan et al., Proc Natl Acad Sci USA. (2012) 109(48):E3295-304; Wang et al. ACS Chem Biol. (2015) 10(11):2512-9; and Zheng et al. Nucleic Acids Res. (2017) 45(6):3369-3377, each of which is incorporated herein by reference in its entirety.

In some embodiments, the adenosine deaminase comprises a mutation at glycine336 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the glycine residue at position 336 is replaced by an aspartic acid residue (G336D).
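Substitution labels such as G336D follow the usual convention of wild-type residue, position, and replacement residue, each residue given by its standard one-letter code. As an illustrative sketch (the function name `parse_substitution` is hypothetical and not part of the disclosure), a label can be expanded into its parts as follows:

```python
import re

# Standard one-letter amino acid codes and their residue names.
RESIDUE_NAMES = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "Q": "glutamine", "E": "glutamic acid", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}


def parse_substitution(label):
    """Split a label like 'G336D' into (wild-type name, position, replacement name)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    if wild_type not in RESIDUE_NAMES or replacement not in RESIDUE_NAMES:
        raise ValueError(f"unknown residue code in {label!r}")
    return RESIDUE_NAMES[wild_type], int(position), RESIDUE_NAMES[replacement]
```

Expanding a label this way is a convenient check that the written residue name agrees with the one-letter code, for example that a label ending in L denotes leucine and one ending in K denotes lysine.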

In some embodiments, the adenosine deaminase comprises a mutation at glycine487 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the glycine residue at position 487 is replaced by a non-polar amino acid residue with a relatively small side chain. For example, in some embodiments, the glycine residue at position 487 is replaced by an alanine residue (G487A). In some embodiments, the glycine residue at position 487 is replaced by a valine residue (G487V). In some embodiments, the glycine residue at position 487 is replaced by an amino acid residue with a relatively large side chain. In some embodiments, the glycine residue at position 487 is replaced by an arginine residue (G487R). In some embodiments, the glycine residue at position 487 is replaced by a lysine residue (G487K). In some embodiments, the glycine residue at position 487 is replaced by a tryptophan residue (G487W). In some embodiments, the glycine residue at position 487 is replaced by a tyrosine residue (G487Y).

In some embodiments, the adenosine deaminase comprises a mutation at glutamic acid488 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the glutamic acid residue at position 488 is replaced by a glutamine residue (E488Q). In some embodiments, the glutamic acid residue at position 488 is replaced by a histidine residue (E488H). In some embodiments, the glutamic acid residue at position 488 is replaced by an arginine residue (E488R). In some embodiments, the glutamic acid residue at position 488 is replaced by a lysine residue (E488K). In some embodiments, the glutamic acid residue at position 488 is replaced by an asparagine residue (E488N). In some embodiments, the glutamic acid residue at position 488 is replaced by an alanine residue (E488A). In some embodiments, the glutamic acid residue at position 488 is replaced by a methionine residue (E488M). In some embodiments, the glutamic acid residue at position 488 is replaced by a serine residue (E488S). In some embodiments, the glutamic acid residue at position 488 is replaced by a phenylalanine residue (E488F). In some embodiments, the glutamic acid residue at position 488 is replaced by a leucine residue (E488L). In some embodiments, the glutamic acid residue at position 488 is replaced by a tryptophan residue (E488W).

In some embodiments, the adenosine deaminase comprises a mutation at threonine490 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the threonine residue at position 490 is replaced by a cysteine residue (T490C). In some embodiments, the threonine residue at position 490 is replaced by a serine residue (T490S). In some embodiments, the threonine residue at position 490 is replaced by an alanine residue (T490A). In some embodiments, the threonine residue at position 490 is replaced by a phenylalanine residue (T490F). In some embodiments, the threonine residue at position 490 is replaced by a tyrosine residue (T490Y). In some embodiments, the threonine residue at position 490 is replaced by an arginine residue (T490R). In some embodiments, the threonine residue at position 490 is replaced by a lysine residue (T490K). In some embodiments, the threonine residue at position 490 is replaced by a proline residue (T490P). In some embodiments, the threonine residue at position 490 is replaced by a glutamic acid residue (T490E).

In some embodiments, the adenosine deaminase comprises a mutation at valine493 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the valine residue at position 493 is replaced by an alanine residue (V493A). In some embodiments, the valine residue at position 493 is replaced by a serine residue (V493S). In some embodiments, the valine residue at position 493 is replaced by a threonine residue (V493T). In some embodiments, the valine residue at position 493 is replaced by an arginine residue (V493R). In some embodiments, the valine residue at position 493 is replaced by an aspartic acid residue (V493D). In some embodiments, the valine residue at position 493 is replaced by a proline residue (V493P). In some embodiments, the valine residue at position 493 is replaced by a glycine residue (V493G).

In some embodiments, the adenosine deaminase comprises a mutation at alanine589 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the alanine residue at position 589 is replaced by a valine residue (A589V).

In some embodiments, the adenosine deaminase comprises a mutation at asparagine597 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the asparagine residue at position 597 is replaced by a lysine residue (N597K). In some embodiments, the asparagine residue at position 597 is replaced by an arginine residue (N597R). In some embodiments, the asparagine residue at position 597 is replaced by an alanine residue (N597A). In some embodiments, the asparagine residue at position 597 is replaced by a glutamic acid residue (N597E). In some embodiments, the asparagine residue at position 597 is replaced by a histidine residue (N597H). In some embodiments, the asparagine residue at position 597 is replaced by a glycine residue (N597G). In some embodiments, the asparagine residue at position 597 is replaced by a tyrosine residue (N597Y). In some embodiments, the asparagine residue at position 597 is replaced by a phenylalanine residue (N597F). In some embodiments, the adenosine deaminase comprises mutation N597I. In some embodiments, the adenosine deaminase comprises mutation N597L. In some embodiments, the adenosine deaminase comprises mutation N597V. In some embodiments, the adenosine deaminase comprises mutation N597M. In some embodiments, the adenosine deaminase comprises mutation N597C. In some embodiments, the adenosine deaminase comprises mutation N597P. In some embodiments, the adenosine deaminase comprises mutation N597T. In some embodiments, the adenosine deaminase comprises mutation N597S. In some embodiments, the adenosine deaminase comprises mutation N597W. In some embodiments, the adenosine deaminase comprises mutation N597Q. In some embodiments, the adenosine deaminase comprises mutation N597D. In certain example embodiments, the mutations at N597 described above are further made in the context of an E488Q background.

In some embodiments, the adenosine deaminase comprises a mutation at serine599 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the serine residue at position 599 is replaced by a threonine residue (S599T).

In some embodiments, the adenosine deaminase comprises a mutation at asparagine613 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the asparagine residue at position 613 is replaced by a lysine residue (N613K). In some embodiments, the asparagine residue at position 613 is replaced by an arginine residue (N613R). In some embodiments, the asparagine residue at position 613 is replaced by an alanine residue (N613A). In some embodiments, the asparagine residue at position 613 is replaced by a glutamic acid residue (N613E). In some embodiments, the adenosine deaminase comprises mutation N613I. In some embodiments, the adenosine deaminase comprises mutation N613L. In some embodiments, the adenosine deaminase comprises mutation N613V. In some embodiments, the adenosine deaminase comprises mutation N613F. In some embodiments, the adenosine deaminase comprises mutation N613M. In some embodiments, the adenosine deaminase comprises mutation N613C. In some embodiments, the adenosine deaminase comprises mutation N613G. In some embodiments, the adenosine deaminase comprises mutation N613P. In some embodiments, the adenosine deaminase comprises mutation N613T. In some embodiments, the adenosine deaminase comprises mutation N613S. In some embodiments, the adenosine deaminase comprises mutation N613Y. In some embodiments, the adenosine deaminase comprises mutation N613W. In some embodiments, the adenosine deaminase comprises mutation N613Q. In some embodiments, the adenosine deaminase comprises mutation N613H. In some embodiments, the adenosine deaminase comprises mutation N613D. In some embodiments, the mutations at N613 described above are further made in combination with an E488Q mutation.

In some embodiments, to improve editing efficiency, the adenosine deaminase may comprise one or more of the mutations: G336D, G487A, G487V, E488Q, E488H, E488R, E488N, E488A, E488S, E488M, T490C, T490S, V493T, V493S, V493A, V493R, V493D, V493P, V493G, N597K, N597R, N597A, N597E, N597H, N597G, N597Y, A589V, S599T, N613K, N613R, N613A, N613E, based on amino acid sequence positions of hADAR2-D, and mutations in a homologous ADAR protein corresponding to the above.

In some embodiments, to reduce editing efficiency, the adenosine deaminase may comprise one or more of the mutations: E488F, E488L, E488W, T490A, T490F, T490Y, T490R, T490K, T490P, T490E, N597F, based on amino acid sequence positions of hADAR2-D, and mutations in a homologous ADAR protein corresponding to the above. In particular embodiments, it can be of interest to use an adenosine deaminase enzyme with reduced efficacy to reduce off-target effects.
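The two groupings above lend themselves to a simple lookup. The sketch below is illustrative only: the set contents are copied from the lists in the text (hADAR2-D numbering), while the names `EFFICIENCY_IMPROVING`, `EFFICIENCY_REDUCING` and `expected_effect` are hypothetical. A label that appears in neither list is reported as unlisted rather than assigned an effect.

```python
# Mutations listed in the text as improving editing efficiency.
EFFICIENCY_IMPROVING = {
    "G336D", "G487A", "G487V", "E488Q", "E488H", "E488R", "E488N", "E488A",
    "E488S", "E488M", "T490C", "T490S", "V493T", "V493S", "V493A", "V493R",
    "V493D", "V493P", "V493G", "N597K", "N597R", "N597A", "N597E", "N597H",
    "N597G", "N597Y", "A589V", "S599T", "N613K", "N613R", "N613A", "N613E",
}

# Mutations listed in the text as reducing editing efficiency.
EFFICIENCY_REDUCING = {
    "E488F", "E488L", "E488W", "T490A", "T490F", "T490Y", "T490R", "T490K",
    "T490P", "T490E", "N597F",
}


def expected_effect(label):
    """Return 'improve', 'reduce', or 'unlisted' for a substitution label."""
    if label in EFFICIENCY_IMPROVING:
        return "improve"
    if label in EFFICIENCY_REDUCING:
        return "reduce"
    return "unlisted"
```

Keeping the two lists as disjoint sets makes it easy to confirm that no label has been assigned to both groups.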

In some embodiments, to reduce off-target effects, the adenosine deaminase comprises one or more of mutations at R348, V351, T375, K376, E396, C451, R455, N473, R474, K475, R477, R481, S486, E488, T490, S495, R510, based on amino acid sequence positions of hADAR2-D, and mutations in a homologous ADAR protein corresponding to the above. In some embodiments, the adenosine deaminase comprises mutation at E488 and one or more additional positions selected from R348, V351, T375, K376, E396, C451, R455, N473, R474, K475, R477, R481, S486, T490, S495, R510. In some embodiments, the adenosine deaminase comprises mutation at T375, and optionally at one or more additional positions. In some embodiments, the adenosine deaminase comprises mutation at N473, and optionally at one or more additional positions. In some embodiments, the adenosine deaminase comprises mutation at V351, and optionally at one or more additional positions. In some embodiments, the adenosine deaminase comprises mutation at E488 and T375, and optionally at one or more additional positions. In some embodiments, the adenosine deaminase comprises mutation at E488 and N473, and optionally at one or more additional positions. In some embodiments, the adenosine deaminase comprises mutation at E488 and V351, and optionally at one or more additional positions. In some embodiments, the adenosine deaminase comprises mutation at E488 and one or more of T375, N473, and V351.

In some embodiments, to reduce off-target effects, the adenosine deaminase comprises one or more of mutations selected from R348E, V351L, T375G, T375S, R455G, R455S, R455E, N473D, R474E, K475Q, R477E, R481E, S486T, E488Q, T490A, T490S, S495T, and R510E, based on amino acid sequence positions of hADAR2-D, and mutations in a homologous ADAR protein corresponding to the above. In some embodiments, the adenosine deaminase comprises mutation E488Q and one or more additional mutations selected from R348E, V351L, T375G, T375S, R455G, R455S, R455E, N473D, R474E, K475Q, R477E, R481E, S486T, T490A, T490S, S495T, and R510E. In some embodiments, the adenosine deaminase comprises mutation T375G or T375S, and optionally one or more additional mutations. In some embodiments, the adenosine deaminase comprises mutation N473D, and optionally one or more additional mutations. In some embodiments, the adenosine deaminase comprises mutation V351L, and optionally one or more additional mutations. In some embodiments, the adenosine deaminase comprises mutation E488Q, and T375G or T375S, and optionally one or more additional mutations. In some embodiments, the adenosine deaminase comprises mutation E488Q and N473D, and optionally one or more additional mutations. In some embodiments, the adenosine deaminase comprises mutation E488Q and V351L, and optionally one or more additional mutations. In some embodiments, the adenosine deaminase comprises mutation E488Q and one or more of T375G/S, N473D and V351L.
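The last combination described above (E488Q together with one or more of T375G, T375S, N473D and V351L) can be expressed as a set test. This is a minimal sketch under the assumption that a variant is represented simply as a collection of substitution labels in hADAR2-D numbering; the names `E488Q_PARTNERS` and `has_e488q_combination` are hypothetical.

```python
# Partner mutations named in the text for combination with E488Q.
E488Q_PARTNERS = {"T375G", "T375S", "N473D", "V351L"}


def has_e488q_combination(mutations):
    """True if the variant carries E488Q plus at least one named partner.

    Additional mutations are allowed, matching the "optionally one or more
    additional mutations" wording.
    """
    present = set(mutations)
    return "E488Q" in present and bool(present & E488Q_PARTNERS)
```

Because the check uses set intersection, the order of the labels and the presence of unrelated mutations do not affect the result.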

In certain examples, the adenosine deaminase protein or catalytic domain thereof has been modified to comprise a mutation at E488, preferably E488Q, of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein, and/or the adenosine deaminase protein or catalytic domain thereof has been modified to comprise a mutation at T375, preferably T375G, of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In certain examples, the adenosine deaminase protein or catalytic domain thereof has been modified to comprise a mutation at E1008, preferably E1008Q, of the hADAR1-D amino acid sequence, or a corresponding position in a homologous ADAR protein.

Crystal structures of the human ADAR2 deaminase domain bound to duplex RNA reveal a protein loop that binds the RNA on the 5′ side of the modification site. This 5′ binding loop is one contributor to substrate specificity differences between ADAR family members. See Wang et al., Nucleic Acids Res., 44(20):9872-9880 (2016), the content of which is incorporated herein by reference in its entirety. In addition, an ADAR2-specific RNA-binding loop was identified near the enzyme active site. See Matthews et al., Nat. Struct. Mol. Biol., 23(5):426-33 (2016), the content of which is incorporated herein by reference in its entirety. In some embodiments, the adenosine deaminase comprises one or more mutations in the RNA binding loop to improve editing specificity and/or efficiency.

In some embodiments, the adenosine deaminase comprises a mutation at alanine454 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the alanine residue at position 454 is replaced by a serine residue (A454S). In some embodiments, the alanine residue at position 454 is replaced by a cysteine residue (A454C). In some embodiments, the alanine residue at position 454 is replaced by an aspartic acid residue (A454D).

In some embodiments, the adenosine deaminase comprises a mutation at arginine455 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the arginine residue at position 455 is replaced by an alanine residue (R455A). In some embodiments, the arginine residue at position 455 is replaced by a valine residue (R455V). In some embodiments, the arginine residue at position 455 is replaced by a histidine residue (R455H). In some embodiments, the arginine residue at position 455 is replaced by a glycine residue (R455G). In some embodiments, the arginine residue at position 455 is replaced by a serine residue (R455S). In some embodiments, the arginine residue at position 455 is replaced by a glutamic acid residue (R455E). In some embodiments, the adenosine deaminase comprises mutation R455C. In some embodiments, the adenosine deaminase comprises mutation R455I. In some embodiments, the adenosine deaminase comprises mutation R455K. In some embodiments, the adenosine deaminase comprises mutation R455L. In some embodiments, the adenosine deaminase comprises mutation R455M. In some embodiments, the adenosine deaminase comprises mutation R455N. In some embodiments, the adenosine deaminase comprises mutation R455Q. In some embodiments, the adenosine deaminase comprises mutation R455F. In some embodiments, the adenosine deaminase comprises mutation R455W. In some embodiments, the adenosine deaminase comprises mutation R455P. In some embodiments, the adenosine deaminase comprises mutation R455Y. In some embodiments, the adenosine deaminase comprises mutation R455D. In some embodiments, the mutations at R455 described above are further made in combination with an E488Q mutation.

In some embodiments, the adenosine deaminase comprises a mutation at isoleucine456 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the isoleucine residue at position 456 is replaced by a valine residue (I456V). In some embodiments, the isoleucine residue at position 456 is replaced by a leucine residue (I456L). In some embodiments, the isoleucine residue at position 456 is replaced by an aspartic acid residue (I456D).

In some embodiments, the adenosine deaminase comprises a mutation at phenylalanine457 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the phenylalanine residue at position 457 is replaced by a tyrosine residue (F457Y). In some embodiments, the phenylalanine residue at position 457 is replaced by an arginine residue (F457R). In some embodiments, the phenylalanine residue at position 457 is replaced by a glutamic acid residue (F457E).

In some embodiments, the adenosine deaminase comprises a mutation at serine458 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the serine residue at position 458 is replaced by a valine residue (S458V). In some embodiments, the serine residue at position 458 is replaced by a phenylalanine residue (S458F). In some embodiments, the serine residue at position 458 is replaced by a proline residue (S458P). In some embodiments, the adenosine deaminase comprises mutation S458I. In some embodiments, the adenosine deaminase comprises mutation S458L. In some embodiments, the adenosine deaminase comprises mutation S458M. In some embodiments, the adenosine deaminase comprises mutation S458C. In some embodiments, the adenosine deaminase comprises mutation S458A. In some embodiments, the adenosine deaminase comprises mutation S458G. In some embodiments, the adenosine deaminase comprises mutation S458T. In some embodiments, the adenosine deaminase comprises mutation S458Y. In some embodiments, the adenosine deaminase comprises mutation S458W. In some embodiments, the adenosine deaminase comprises mutation S458Q. In some embodiments, the adenosine deaminase comprises mutation S458N. In some embodiments, the adenosine deaminase comprises mutation S458H. In some embodiments, the adenosine deaminase comprises mutation S458E. In some embodiments, the adenosine deaminase comprises mutation S458D. In some embodiments, the adenosine deaminase comprises mutation S458K. In some embodiments, the adenosine deaminase comprises mutation S458R. In some embodiments, the mutations at S458 described above are further made in combination with an E488Q mutation.

In some embodiments, the adenosine deaminase comprises a mutation at proline459 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the proline residue at position 459 is replaced by a cysteine residue (P459C). In some embodiments, the proline residue at position 459 is replaced by a histidine residue (P459H). In some embodiments, the proline residue at position 459 is replaced by a tryptophan residue (P459W).

In some embodiments, the adenosine deaminase comprises a mutation at histidine460 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the histidine residue at position 460 is replaced by an arginine residue (H460R). In some embodiments, the histidine residue at position 460 is replaced by an isoleucine residue (H460I). In some embodiments, the histidine residue at position 460 is replaced by a proline residue (H460P). In some embodiments, the adenosine deaminase comprises mutation H460L. In some embodiments, the adenosine deaminase comprises mutation H460V. In some embodiments, the adenosine deaminase comprises mutation H460F. In some embodiments, the adenosine deaminase comprises mutation H460M. In some embodiments, the adenosine deaminase comprises mutation H460C. In some embodiments, the adenosine deaminase comprises mutation H460A. In some embodiments, the adenosine deaminase comprises mutation H460G. In some embodiments, the adenosine deaminase comprises mutation H460T. In some embodiments, the adenosine deaminase comprises mutation H460S. In some embodiments, the adenosine deaminase comprises mutation H460Y. In some embodiments, the adenosine deaminase comprises mutation H460W. In some embodiments, the adenosine deaminase comprises mutation H460Q. In some embodiments, the adenosine deaminase comprises mutation H460N. In some embodiments, the adenosine deaminase comprises mutation H460E. In some embodiments, the adenosine deaminase comprises mutation H460D. In some embodiments, the adenosine deaminase comprises mutation H460K. In some embodiments, the mutations at H460 described above are further made in combination with an E488Q mutation.

In some embodiments, the adenosine deaminase comprises a mutation at proline462 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the proline residue at position 462 is replaced by a serine residue (P462S). In some embodiments, the proline residue at position 462 is replaced by a tryptophan residue (P462W). In some embodiments, the proline residue at position 462 is replaced by a glutamic acid residue (P462E).

In some embodiments, the adenosine deaminase comprises a mutation at aspartic acid469 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the aspartic acid residue at position 469 is replaced by a glutamine residue (D469Q). In some embodiments, the aspartic acid residue at position 469 is replaced by a serine residue (D469S). In some embodiments, the aspartic acid residue at position 469 is replaced by a tyrosine residue (D469Y).

In some embodiments, the adenosine deaminase comprises a mutation at arginine470 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the arginine residue at position 470 is replaced by an alanine residue (R470A). In some embodiments, the arginine residue at position 470 is replaced by an isoleucine residue (R470I). In some embodiments, the arginine residue at position 470 is replaced by an aspartic acid residue (R470D).

In some embodiments, the adenosine deaminase comprises a mutation at histidine471 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the histidine residue at position 471 is replaced by a lysine residue (H471K). In some embodiments, the histidine residue at position 471 is replaced by a threonine residue (H471T). In some embodiments, the histidine residue at position 471 is replaced by a valine residue (H471V).

In some embodiments, the adenosine deaminase comprises a mutation at proline472 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the proline residue at position 472 is replaced by a lysine residue (P472K). In some embodiments, the proline residue at position 472 is replaced by a threonine residue (P472T). In some embodiments, the proline residue at position 472 is replaced by an aspartic acid residue (P472D).

In some embodiments, the adenosine deaminase comprises a mutation at asparagine473 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the asparagine residue at position 473 is replaced by an arginine residue (N473R). In some embodiments, the asparagine residue at position 473 is replaced by a tryptophan residue (N473W). In some embodiments, the asparagine residue at position 473 is replaced by a proline residue (N473P). In some embodiments, the asparagine residue at position 473 is replaced by an aspartic acid residue (N473D).

In some embodiments, the adenosine deaminase comprises a mutation at arginine474 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the arginine residue at position 474 is replaced by a lysine residue (R474K). In some embodiments, the arginine residue at position 474 is replaced by a glycine residue (R474G). In some embodiments, the arginine residue at position 474 is replaced by an aspartic acid residue (R474D). In some embodiments, the arginine residue at position 474 is replaced by a glutamic acid residue (R474E).

In some embodiments, the adenosine deaminase comprises a mutation at lysine475 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the lysine residue at position 475 is replaced by a glutamine residue (K475Q). In some embodiments, the lysine residue at position 475 is replaced by an asparagine residue (K475N). In some embodiments, the lysine residue at position 475 is replaced by an aspartic acid residue (K475D).

In some embodiments, the adenosine deaminase comprises a mutation at alanine476 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the alanine residue at position 476 is replaced by a serine residue (A476S). In some embodiments, the alanine residue at position 476 is replaced by an arginine residue (A476R). In some embodiments, the alanine residue at position 476 is replaced by a glutamic acid residue (A476E).

In some embodiments, the adenosine deaminase comprises a mutation at arginine477 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the arginine residue at position 477 is replaced by a lysine residue (R477K). In some embodiments, the arginine residue at position 477 is replaced by a threonine residue (R477T). In some embodiments, the arginine residue at position 477 is replaced by a phenylalanine residue (R477F). In some embodiments, the arginine residue at position 477 is replaced by a glutamic acid residue (R477E).

In some embodiments, the adenosine deaminase comprises a mutation at glycine478 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the glycine residue at position 478 is replaced by an alanine residue (G478A). In some embodiments, the glycine residue at position 478 is replaced by an arginine residue (G478R). In some embodiments, the glycine residue at position 478 is replaced by a tyrosine residue (G478Y). In some embodiments, the adenosine deaminase comprises mutation G478I. In some embodiments, the adenosine deaminase comprises mutation G478L. In some embodiments, the adenosine deaminase comprises mutation G478V. In some embodiments, the adenosine deaminase comprises mutation G478F. In some embodiments, the adenosine deaminase comprises mutation G478M. In some embodiments, the adenosine deaminase comprises mutation G478C. In some embodiments, the adenosine deaminase comprises mutation G478P. In some embodiments, the adenosine deaminase comprises mutation G478T. In some embodiments, the adenosine deaminase comprises mutation G478S. In some embodiments, the adenosine deaminase comprises mutation G478W. In some embodiments, the adenosine deaminase comprises mutation G478Q. In some embodiments, the adenosine deaminase comprises mutation G478N. In some embodiments, the adenosine deaminase comprises mutation G478H. In some embodiments, the adenosine deaminase comprises mutation G478E. In some embodiments, the adenosine deaminase comprises mutation G478D. In some embodiments, the adenosine deaminase comprises mutation G478K. In some embodiments, the mutations at G478 described above are further made in combination with an E488Q mutation.

In some embodiments, the adenosine deaminase comprises a mutation at glutamine479 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the glutamine residue at position 479 is replaced by an asparagine residue (Q479N). In some embodiments, the glutamine residue at position 479 is replaced by a serine residue (Q479S). In some embodiments, the glutamine residue at position 479 is replaced by a proline residue (Q479P).

In some embodiments, the adenosine deaminase comprises a mutation at arginine348 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the arginine residue at position 348 is replaced by an alanine residue (R348A). In some embodiments, the arginine residue at position 348 is replaced by a glutamic acid residue (R348E).

In some embodiments, the adenosine deaminase comprises a mutation at valine351 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the valine residue at position 351 is replaced by a leucine residue (V351L). In some embodiments, the adenosine deaminase comprises mutation V351Y. In some embodiments, the adenosine deaminase comprises mutation V351M. In some embodiments, the adenosine deaminase comprises mutation V351T. In some embodiments, the adenosine deaminase comprises mutation V351G. In some embodiments, the adenosine deaminase comprises mutation V351A. In some embodiments, the adenosine deaminase comprises mutation V351F. In some embodiments, the adenosine deaminase comprises mutation V351E. In some embodiments, the adenosine deaminase comprises mutation V351I. In some embodiments, the adenosine deaminase comprises mutation V351C. In some embodiments, the adenosine deaminase comprises mutation V351H. In some embodiments, the adenosine deaminase comprises mutation V351P. In some embodiments, the adenosine deaminase comprises mutation V351S. In some embodiments, the adenosine deaminase comprises mutation V351K. In some embodiments, the adenosine deaminase comprises mutation V351N. In some embodiments, the adenosine deaminase comprises mutation V351W. In some embodiments, the adenosine deaminase comprises mutation V351Q. In some embodiments, the adenosine deaminase comprises mutation V351D. In some embodiments, the adenosine deaminase comprises mutation V351R. In some embodiments, the mutations at V351 described above are further made in combination with an E488Q mutation.

In some embodiments, the adenosine deaminase comprises a mutation at threonine375 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the threonine residue at position 375 is replaced by a glycine residue (T375G). In some embodiments, the threonine residue at position 375 is replaced by a serine residue (T375S). In some embodiments, the adenosine deaminase comprises mutation T375H. In some embodiments, the adenosine deaminase comprises mutation T375Q. In some embodiments, the adenosine deaminase comprises mutation T375C. In some embodiments, the adenosine deaminase comprises mutation T375N. In some embodiments, the adenosine deaminase comprises mutation T375M. In some embodiments, the adenosine deaminase comprises mutation T375A. In some embodiments, the adenosine deaminase comprises mutation T375W. In some embodiments, the adenosine deaminase comprises mutation T375V. In some embodiments, the adenosine deaminase comprises mutation T375R. In some embodiments, the adenosine deaminase comprises mutation T375E. In some embodiments, the adenosine deaminase comprises mutation T375K. In some embodiments, the adenosine deaminase comprises mutation T375F. In some embodiments, the adenosine deaminase comprises mutation T375I. In some embodiments, the adenosine deaminase comprises mutation T375D. In some embodiments, the adenosine deaminase comprises mutation T375P. In some embodiments, the adenosine deaminase comprises mutation T375L. In some embodiments, the adenosine deaminase comprises mutation T375Y. In some embodiments, the mutations at T375 described above are further made in combination with an E488Q mutation.

In some embodiments, the adenosine deaminase comprises a mutation at Arg481 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the arginine residue at position 481 is replaced by a glutamic acid residue (R481E).

In some embodiments, the adenosine deaminase comprises a mutation at Ser486 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the serine residue at position 486 is replaced by a threonine residue (S486T).

In some embodiments, the adenosine deaminase comprises a mutation at Thr490 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the threonine residue at position 490 is replaced by an alanine residue (T490A). In some embodiments, the threonine residue at position 490 is replaced by a serine residue (T490S).

In some embodiments, the adenosine deaminase comprises a mutation at Ser495 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the serine residue at position 495 is replaced by a threonine residue (S495T).

In some embodiments, the adenosine deaminase comprises a mutation at Arg510 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the arginine residue at position 510 is replaced by a glutamine residue (R510Q). In some embodiments, the arginine residue at position 510 is replaced by an alanine residue (R510A). In some embodiments, the arginine residue at position 510 is replaced by a glutamic acid residue (R510E).

In some embodiments, the adenosine deaminase comprises a mutation at Gly593 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the glycine residue at position 593 is replaced by an alanine residue (G593A). In some embodiments, the glycine residue at position 593 is replaced by a glutamic acid residue (G593E).

In some embodiments, the adenosine deaminase comprises a mutation at Lys594 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the lysine residue at position 594 is replaced by an alanine residue (K594A).

In some embodiments, the adenosine deaminase comprises a mutation at any one or more of positions A454, R455, I456, F457, S458, P459, H460, P462, D469, R470, H471, P472, N473, R474, K475, A476, R477, G478, Q479, R348, R510, G593, K594 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein.

In some embodiments, the adenosine deaminase comprises any one or more of mutations A454S, A454C, A454D, R455A, R455V, R455H, I456V, I456L, I456D, F457Y, F457R, F457E, S458V, S458F, S458P, P459C, P459H, P459W, H460R, H460I, H460P, P462S, P462W, P462E, D469Q, D469S, D469Y, R470A, R470I, R470D, H471K, H471T, H471V, P472K, P472T, P472D, N473R, N473W, N473P, R474K, R474G, R474D, K475Q, K475N, K475D, A476S, A476R, A476E, R477K, R477T, R477F, G478A, G478R, G478Y, Q479N, Q479S, Q479P, R348A, R510Q, R510A, G593A, G593E, K594A of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein.

In some embodiments, the adenosine deaminase comprises a mutation at any one or more of positions T375, V351, G478, S458, H460 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein, optionally in combination with a mutation at E488. In some embodiments, the adenosine deaminase comprises one or more mutations selected from T375G, T375C, T375H, T375Q, V351M, V351T, V351Y, G478R, S458F, H460I, optionally in combination with E488Q.

In some embodiments, the adenosine deaminase comprises one or more mutations selected from T375H, T375Q, V351M, V351Y, H460P, optionally in combination with E488Q.

In some embodiments, the adenosine deaminase comprises mutations T375S and S458F, optionally in combination with E488Q.

In some embodiments, the adenosine deaminase comprises a mutation at two or more of positions T375, N473, R474, G478, S458, P459, V351, R455, T490, R348, Q479 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein, optionally in combination with a mutation at E488. In some embodiments, the adenosine deaminase comprises two or more mutations selected from T375G, T375S, N473D, R474E, G478R, S458F, P459W, V351L, R455G, R455S, T490A, R348E, Q479P, optionally in combination with E488Q.

In some embodiments, the adenosine deaminase comprises mutations T375G and V351L. In some embodiments, the adenosine deaminase comprises mutations T375G and R455G. In some embodiments, the adenosine deaminase comprises mutations T375G and R455S. In some embodiments, the adenosine deaminase comprises mutations T375G and T490A. In some embodiments, the adenosine deaminase comprises mutations T375G and R348E. In some embodiments, the adenosine deaminase comprises mutations T375S and V351L. In some embodiments, the adenosine deaminase comprises mutations T375S and R455G. In some embodiments, the adenosine deaminase comprises mutations T375S and R455S. In some embodiments, the adenosine deaminase comprises mutations T375S and T490A. In some embodiments, the adenosine deaminase comprises mutations T375S and R348E. In some embodiments, the adenosine deaminase comprises mutations N473D and V351L. In some embodiments, the adenosine deaminase comprises mutations N473D and R455G. In some embodiments, the adenosine deaminase comprises mutations N473D and R455S. In some embodiments, the adenosine deaminase comprises mutations N473D and T490A. In some embodiments, the adenosine deaminase comprises mutations N473D and R348E. In some embodiments, the adenosine deaminase comprises mutations R474E and V351L. In some embodiments, the adenosine deaminase comprises mutations R474E and R455G. In some embodiments, the adenosine deaminase comprises mutations R474E and R455S. In some embodiments, the adenosine deaminase comprises mutations R474E and T490A. In some embodiments, the adenosine deaminase comprises mutations R474E and R348E. In some embodiments, the adenosine deaminase comprises mutations S458F and T375G. In some embodiments, the adenosine deaminase comprises mutations S458F and T375S. In some embodiments, the adenosine deaminase comprises mutations S458F and N473D. In some embodiments, the adenosine deaminase comprises mutations S458F and R474E. In some embodiments, the adenosine deaminase comprises mutations S458F and G478R. In some embodiments, the adenosine deaminase comprises mutations G478R and T375G. In some embodiments, the adenosine deaminase comprises mutations G478R and T375S. In some embodiments, the adenosine deaminase comprises mutations G478R and N473D. In some embodiments, the adenosine deaminase comprises mutations G478R and R474E. In some embodiments, the adenosine deaminase comprises mutations P459W and T375G. In some embodiments, the adenosine deaminase comprises mutations P459W and T375S. In some embodiments, the adenosine deaminase comprises mutations P459W and N473D. In some embodiments, the adenosine deaminase comprises mutations P459W and R474E. In some embodiments, the adenosine deaminase comprises mutations P459W and G478R. In some embodiments, the adenosine deaminase comprises mutations P459W and S458F. In some embodiments, the adenosine deaminase comprises mutations Q479P and T375G. In some embodiments, the adenosine deaminase comprises mutations Q479P and T375S. In some embodiments, the adenosine deaminase comprises mutations Q479P and N473D. In some embodiments, the adenosine deaminase comprises mutations Q479P and R474E. In some embodiments, the adenosine deaminase comprises mutations Q479P and G478R. In some embodiments, the adenosine deaminase comprises mutations Q479P and S458F. In some embodiments, the adenosine deaminase comprises mutations Q479P and P459W. All mutations described in this paragraph may also further be made in combination with an E488Q mutation.

In some embodiments, the adenosine deaminase comprises a mutation at any one or more of positions K475, Q479, P459, G478, S458 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein, optionally in combination with a mutation at E488. In some embodiments, the adenosine deaminase comprises one or more mutations selected from K475N, Q479N, P459W, G478R, S458P, S458F, optionally in combination with E488Q.

In some embodiments, the adenosine deaminase comprises a mutation at any one or more of positions T375, V351, R455, H460, A476 of the hADAR2-D amino acid sequence, or a corresponding position in a homologous ADAR protein, optionally in combination with a mutation at E488. In some embodiments, the adenosine deaminase comprises one or more mutations selected from T375G, T375C, T375H, T375Q, V351M, V351T, V351Y, R455H, H460P, H460I, A476E, optionally in combination with E488Q.
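
The substitution labels used in the preceding paragraphs follow the one-letter convention of wild-type residue, position, and replacement residue (for example, T375G). The following Python sketch is illustrative only and is not part of the disclosed methods; it shows one way such labels might be parsed and applied to a sequence fragment while checking that the named wild-type residue is actually present. The function names, the `first_position` parameter, and the five-residue fragment are hypothetical. The fragment is a placeholder that merely agrees with residues named in the text (S486, G487, E488, T490) and is not the hADAR2-D sequence.

```python
import re
from typing import Iterable, NamedTuple

# One-letter codes for the twenty standard amino acids.
_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_LABEL = re.compile(rf"^([{_AMINO_ACIDS}])(\d+)([{_AMINO_ACIDS}])$")


class Substitution(NamedTuple):
    wild_type: str
    position: int
    replacement: str


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'T375G' into (wild type, position, replacement)."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a single-residue substitution: {label!r}")
    wild_type, position, replacement = match.groups()
    return Substitution(wild_type, int(position), replacement)


def apply_substitutions(sequence: str, labels: Iterable[str],
                        first_position: int = 1) -> str:
    """Apply substitutions to a fragment whose first residue is numbered
    first_position (hypothetical parameter, for fragments of a longer protein)."""
    residues = list(sequence)
    for label in labels:
        sub = parse_substitution(label)
        index = sub.position - first_position
        if not 0 <= index < len(residues):
            raise ValueError(f"{label}: position lies outside the fragment")
        if sequence[index] != sub.wild_type:
            raise ValueError(
                f"{label}: fragment has {sequence[index]} at {sub.position}")
        residues[index] = sub.replacement
    return "".join(residues)


# Placeholder fragment numbered from 486; 'X' marks a residue not named in the text.
fragment = "SGEXT"
mutant = apply_substitutions(fragment, ["E488Q", "T490A"], first_position=486)
```

The wild-type check is the useful part of this design: a label whose first letter does not match the reference residue (for instance, a position number carried over from another paragraph) is rejected instead of silently applied.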

In certain embodiments, improvement of editing and reduction of off-target modification is achieved by chemical modification of gRNAs. gRNAs which are chemically modified as exemplified in Vogel et al. (2014), Angew Chem Int Ed, 53:6267-6271, doi:10.1002/anie.201402634 (incorporated herein by reference in its entirety) reduce off-target activity and improve on-target efficiency. 2′-O-methyl and phosphorothioate modified guide RNAs in general improve editing efficiency in cells.

ADAR has been known to demonstrate a preference for neighboring nucleotides on either side of the edited A (www.nature.com/nsmb/journal/v23/n5/full/nsmb.3203.html, Matthews et al. (2016), Nature Structural Mol Biol, 23(5): 426-433, incorporated herein by reference in its entirety). Accordingly, in certain embodiments, the gRNA, target, and/or ADAR is selected or optimized for motif preference.

Intentional mismatches have been demonstrated in vitro to allow for editing of non-preferred motifs (academic.oup.com/nar/article-lookup/doi/10.1093/nar/gku272; Schneider et al. (2014), Nucleic Acid Res, 42(10):e87; Fukuda et al. (2017), Scientific Reports, 7, doi:10.1038/srep41478, each incorporated herein by reference in its entirety). Accordingly, in certain embodiments, to enhance RNA editing efficiency on non-preferred 5′ or 3′ neighboring bases, intentional mismatches in neighboring bases are introduced.

In some embodiments, the adenosine deaminase may be a tRNA-specific adenosine deaminase or a variant thereof. In some embodiments, the adenosine deaminase may comprise one or more of the mutations: W23L, W23R, R26G, H36L, N37S, P48S, P48T, P48A, I49V, R51L, N72D, L84F, S97C, A106V, D108N, H123Y, G125A, A142N, S146C, D147Y, R152H, R152P, E155V, I156F, K157N, K161T, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In some embodiments, the adenosine deaminase may comprise the mutation D108N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In some embodiments, the adenosine deaminase may comprise one or more of the mutations: A106V, D108N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In some embodiments, the adenosine deaminase may comprise one or more of the mutations: A106V, D108N, D147Y, E155V, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In some embodiments, the adenosine deaminase may comprise one or more of the mutations: A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In some embodiments, the adenosine deaminase may comprise one or more of the mutations: A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, A142N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In some embodiments, the adenosine deaminase may comprise one or more of the mutations: A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In some embodiments, the adenosine deaminase may comprise one or more of the mutations: A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, P48S, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In some embodiments, the adenosine deaminase may comprise one or more of the mutations: A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, P48S, A142N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In some embodiments, the adenosine deaminase may comprise one or more of the mutations: A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, P48S, W23R, P48A, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In some embodiments, the adenosine deaminase may comprise one or more of the mutations: A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, P48S, W23R, P48A, A142N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In some embodiments, the adenosine deaminase may comprise one or more of the mutations: A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, P48S, W23R, P48A, R152P, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above. In some embodiments, the adenosine deaminase may comprise one or more of the mutations: A106V, D108N, D147Y, E155V, L84F, H123Y, I156F, H36L, R51L, S146C, K157N, P48S, W23R, P48A, R152P, A142N, based on amino acid sequence positions of E. coli TadA, and mutations in a homologous deaminase protein corresponding to the above.

Results suggest that A's opposite C's in the targeting window of the ADAR deaminase domain are preferentially edited over other bases. Additionally, A's base-paired with U's within a few bases of the targeted base show low levels of editing by CRISPR-Cas-ADAR fusions, suggesting that there is flexibility for the enzyme to edit multiple A's. These two observations suggest that multiple A's in the activity window of CRISPR-Cas-ADAR fusions could be specified for editing by mismatching all A's to be edited with C's. Accordingly, in certain embodiments, multiple A:C mismatches in the activity window are designed to create multiple A:I edits. In certain embodiments, to suppress potential off-target editing in the activity window, non-target A's are paired with A's or G's.
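
The pairing rules in the preceding paragraph can be sketched in code. The following Python example is a minimal illustration, not the disclosed design procedure: it builds the guide strand opposite a target RNA, placing a C opposite each adenosine to be edited, an A or G opposite every other adenosine inside the activity window, and the Watson-Crick partner elsewhere. The function name, the zero-based indexing, the `window` argument, and the example sequence are all assumptions made for illustration; the extent of the activity window is not specified here.

```python
from typing import Iterable

# Watson-Crick partners for RNA bases.
WATSON_CRICK = {"A": "U", "C": "G", "G": "C", "U": "A"}


def design_guide(target: str, edit_positions: Iterable[int],
                 window: Iterable[int], off_target_base: str = "G") -> str:
    """Return a guide (5'->3') for a target RNA given 5'->3'.

    edit_positions: zero-based indices of adenosines to edit (paired with C).
    window: zero-based indices treated as the activity window (assumption).
    off_target_base: base placed opposite non-target A's in the window (A or G).
    """
    if off_target_base not in ("A", "G"):
        raise ValueError("non-target A's are paired with A or G")
    target = target.upper().replace("T", "U")
    edits = set(edit_positions)
    window_set = set(window)
    if any(i < 0 or i >= len(target) for i in edits):
        raise ValueError("edit position lies outside the target")
    if not edits <= window_set:
        raise ValueError("edit positions must lie inside the activity window")

    opposite = []
    for index, base in enumerate(target):
        if index in edits:
            if base != "A":
                raise ValueError(f"position {index} is {base}, not A")
            opposite.append("C")              # A:C mismatch marks the edit site
        elif base == "A" and index in window_set:
            opposite.append(off_target_base)  # A:A or A:G suppresses editing
        else:
            opposite.append(WATSON_CRICK[base])
    # The guide is antiparallel to the target, so reverse to read 5'->3'.
    return "".join(reversed(opposite))


# Hypothetical target with adenosines at indices 2, 4 and 6; only index 4 is edited.
guide = design_guide("GCAUAGACU", edit_positions={4}, window=range(2, 7))
```

Passing several indices in `edit_positions` yields multiple A:C mismatches in one guide, corresponding to the multiple-edit embodiments described above.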

The terms “editing specificity” and “editing preference” are used interchangeably herein to refer to the extent of A-to-I editing at a particular adenosine site in a double-stranded substrate. In some embodiments, the substrate editing preference is determined by the 5′ nearest neighbor and/or the 3′ nearest neighbor of the target adenosine residue. In some embodiments, the adenosine deaminase has preference for the 5′ nearest neighbor of the substrate ranked as U>A>C>G (“>” indicates greater preference). In some embodiments, the adenosine deaminase has preference for the 3′ nearest neighbor of the substrate ranked as G>C˜A>U (“>” indicates greater preference; “˜” indicates similar preference). In some embodiments, the adenosine deaminase has preference for the 3′ nearest neighbor of the substrate ranked as G>C>U˜A (“>” indicates greater preference; “˜” indicates similar preference). In some embodiments, the adenosine deaminase has preference for the 3′ nearest neighbor of the substrate ranked as G>C>A>U (“>” indicates greater preference). In some embodiments, the adenosine deaminase has preference for the 3′ nearest neighbor of the substrate ranked as C˜G˜A>U (“>” indicates greater preference; “˜” indicates similar preference). In some embodiments, the adenosine deaminase has preference for a triplet sequence containing the target adenosine residue ranked as TAG>AAG>CAC>AAT>GAA>GAC (“>” indicates greater preference), the center A being the target adenosine residue.
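
As an illustration of how the nearest-neighbor rankings above might be applied, the following Python sketch looks up the ordinal rank of the 5′ and 3′ neighbors of a candidate adenosine, using the 5′ ranking U>A>C>G and the first of the 3′ rankings listed above, G>C˜A>U. Rank 0 denotes the most preferred neighbor, and bases stated to have similar preference share a rank. The function name, the zero-based indexing, and the example sequence are assumptions; the ranks are ordinal only and are not editing efficiencies, and the sketch deliberately does not combine the two ranks into a single score.

```python
from typing import Tuple

# 5' nearest neighbor ranked U > A > C > G (0 = most preferred).
FIVE_PRIME_RANK = {"U": 0, "A": 1, "C": 2, "G": 3}
# 3' nearest neighbor ranked G > C ~ A > U; C and A share a rank.
THREE_PRIME_RANK = {"G": 0, "C": 1, "A": 1, "U": 2}


def neighbor_ranks(rna: str, index: int) -> Tuple[int, int]:
    """Return (5' neighbor rank, 3' neighbor rank) for the adenosine at index."""
    rna = rna.upper().replace("T", "U")
    if not 0 < index < len(rna) - 1:
        raise ValueError("the target adenosine needs a neighbor on each side")
    if rna[index] != "A":
        raise ValueError(f"position {index} is {rna[index]}, not A")
    return FIVE_PRIME_RANK[rna[index - 1]], THREE_PRIME_RANK[rna[index + 1]]


# Hypothetical sequence with adenosines at indices 1, 4 and 7.
example = "UAGCAUGAC"
ranks = {i: neighbor_ranks(example, i) for i in (1, 4, 7)}
```

In this example the adenosine at index 1 sits in a UAG context and receives the most preferred rank on both sides, whereas the adenosine at index 7 has a 5′ G, the least preferred 5′ neighbor in this ranking.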

In some embodiments, the substrate editing preference of an adenosine deaminase is affected by the presence or absence of a nucleic acid binding domain in the adenosine deaminase protein. In some embodiments, to modify substrate editing preference, the deaminase domain is connected with a double-strand RNA binding domain (dsRBD) or a double-strand RNA binding motif (dsRBM). In some embodiments, the dsRBD or dsRBM may be derived from an ADAR protein, such as hADAR1 or hADAR2. In some embodiments, a full length ADAR protein that comprises at least one dsRBD and a deaminase domain is used. In some embodiments, the one or more dsRBM or dsRBD is at the N-terminus of the deaminase domain. In other embodiments, the one or more dsRBM or dsRBD is at the C-terminus of the deaminase domain.

In some embodiments, the substrate editing preference of an adenosine deaminase is affected by amino acid residues near or in the active center of the enzyme. In some embodiments, to modify substrate editing preference, the adenosine deaminase may comprise one or more of the mutations: G336D, G487R, G487K, G487W, G487Y, E488Q, E488N, T490A, V493A, V493T, V493S, N597K, N597R, A589V, S599T, N613K, N613R, based on amino acid sequence positions of hADAR2-D, and mutations in a homologous ADAR protein corresponding to the above.

Particularly, in some embodiments, to reduce editing specificity, the adenosine deaminase can comprise one or more of mutations E488Q, V493A, N597K, N613K, based on amino acid sequence positions of hADAR2-D, and mutations in a homologous ADAR protein corresponding to the above. In some embodiments, to increase editing specificity, the adenosine deaminase can comprise mutation T490A.

In some embodiments, to increase editing preference for target adenosine (A) with an immediate 5′ G, such as substrates comprising the triplet sequence GAC, the center A being the target adenosine residue, the adenosine deaminase can comprise one or more of mutations G336D, E488Q, E488N, V493T, V493S, V493A, A589V, N597K, N597R, S599T, N613K, N613R, based on amino acid sequence positions of hADAR2-D, and mutations in a homologous ADAR protein corresponding to the above.

Particularly, in some embodiments, the adenosine deaminase comprises mutation E488Q or a corresponding mutation in a homologous ADAR protein for editing substrates comprising the following triplet sequences: GAC, GAA, GAU, GAG, CAU, AAU, UAC, the center A being the target adenosine residue.
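
By way of illustration only, the triplet contexts listed above for the E488Q mutant can be used to scan a transcript for candidate adenosines. The following Python sketch returns every adenosine whose triplet context, with the A at the center, appears in that list. The function name, the zero-based indexing, and the example sequence are assumptions; membership in the list says nothing about the editing efficiency at a given site.

```python
from typing import List, Tuple

# Triplets listed for the E488Q mutant; the center A is the target adenosine.
E488Q_TRIPLETS = frozenset({"GAC", "GAA", "GAU", "GAG", "CAU", "AAU", "UAC"})


def listed_triplet_sites(rna: str) -> List[Tuple[int, str]]:
    """Return (index, triplet) for each adenosine in a listed triplet context."""
    rna = rna.upper().replace("T", "U")
    sites = []
    # Skip the first and last base: they lack a 5' or 3' neighbor.
    for index in range(1, len(rna) - 1):
        triplet = rna[index - 1:index + 2]
        if rna[index] == "A" and triplet in E488Q_TRIPLETS:
            sites.append((index, triplet))
    return sites


# Hypothetical sequence with adenosines at indices 2, 5, 6 and 9.
sites = listed_triplet_sites("GGACUAAUCAG")
```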

In some embodiments, the adenosine deaminase comprises the wild-type amino acid sequence of hADAR1-D. In some embodiments, the adenosine deaminase comprises one or more mutations in the hADAR1-D sequence, such that the editing efficiency and/or substrate editing preference of hADAR1-D is changed according to specific needs.

In some embodiments, the adenosine deaminase comprises a mutation at Glycine1007 of the hADAR1-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the glycine residue at position 1007 is replaced by a non-polar amino acid residue with relatively small side chains. For example, in some embodiments, the glycine residue at position 1007 is replaced by an alanine residue (G1007A). In some embodiments, the glycine residue at position 1007 is replaced by a valine residue (G1007V). In some embodiments, the glycine residue at position 1007 is replaced by an amino acid residue with relatively large side chains. In some embodiments, the glycine residue at position 1007 is replaced by an arginine residue (G1007R). In some embodiments, the glycine residue at position 1007 is replaced by a lysine residue (G1007K). In some embodiments, the glycine residue at position 1007 is replaced by a tryptophan residue (G1007W). In some embodiments, the glycine residue at position 1007 is replaced by a tyrosine residue (G1007Y). Additionally, in other embodiments, the glycine residue at position 1007 is replaced by a leucine residue (G1007L). In other embodiments, the glycine residue at position 1007 is replaced by a threonine residue (G1007T). In other embodiments, the glycine residue at position 1007 is replaced by a serine residue (G1007S).

In some embodiments, the adenosine deaminase comprises a mutation at glutamic acid1008 of the hADAR1-D amino acid sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the glutamic acid residue at position 1008 is replaced by a polar amino acid residue having a relatively large side chain. In some embodiments, the glutamic acid residue at position 1008 is replaced by a glutamine residue (E1008Q). In some embodiments, the glutamic acid residue at position 1008 is replaced by a histidine residue (E1008H). In some embodiments, the glutamic acid residue at position 1008 is replaced by an arginine residue (E1008R). In some embodiments, the glutamic acid residue at position 1008 is replaced by a lysine residue (E1008K). In some embodiments, the glutamic acid residue at position 1008 is replaced by a nonpolar or small polar amino acid residue. In some embodiments, the glutamic acid residue at position 1008 is replaced by a phenylalanine residue (E1008F). In some embodiments, the glutamic acid residue at position 1008 is replaced by a tryptophan residue (E1008W). In some embodiments, the glutamic acid residue at position 1008 is replaced by a glycine residue (E1008G). In some embodiments, the glutamic acid residue at position 1008 is replaced by an isoleucine residue (E1008I). In some embodiments, the glutamic acid residue at position 1008 is replaced by a valine residue (E1008V). In some embodiments, the glutamic acid residue at position 1008 is replaced by a proline residue (E1008P). In some embodiments, the glutamic acid residue at position 1008 is replaced by a serine residue (E1008S). In other embodiments, the glutamic acid residue at position 1008 is replaced by an asparagine residue (E1008N). In other embodiments, the glutamic acid residue at position 1008 is replaced by an alanine residue (E1008A). In other embodiments, the glutamic acid residue at position 1008 is replaced by a methionine residue (E1008M). In some embodiments, the glutamic acid residue at position 1008 is replaced by a leucine residue (E1008L).

In some embodiments, to improve editing efficiency, the adenosine deaminase may comprise one or more of the mutations: G1007S, G1007A, G1007V, E1008Q, E1008R, E1008H, E1008M, E1008N, E1008K, based on amino acid sequence positions of hADAR1-D, and mutations in a homologous ADAR protein corresponding to the above.

In some embodiments, to reduce editing efficiency, the adenosine deaminase may comprise one or more of the mutations: G1007R, G1007K, G1007Y, G1007L, G1007T, E1008G, E1008I, E1008P, E1008V, E1008F, E1008W, E1008S, E1008N, E1008K, based on amino acid sequence positions of hADAR1-D, and mutations in a homologous ADAR protein corresponding to the above.

In some embodiments, the substrate editing preference, efficiency and/or selectivity of an adenosine deaminase is affected by amino acid residues near or in the active center of the enzyme. In some embodiments, the adenosine deaminase comprises a mutation at the glutamic acid 1008 position in the hADAR1-D sequence, or a corresponding position in a homologous ADAR protein. In some embodiments, the mutation is E1008R, or a corresponding mutation in a homologous ADAR protein. In some embodiments, the E1008R mutant has an increased editing efficiency for a target adenosine residue that has a mismatched G residue on the opposite strand.

In some embodiments, the adenosine deaminase protein further comprises or is connected to one or more double-stranded RNA (dsRNA) binding motifs (dsRBMs) or domains (dsRBDs) for recognizing and binding to double-stranded nucleic acid substrates. In some embodiments, the interaction between the adenosine deaminase and the double-stranded substrate is mediated by one or more additional protein factor(s), including a CRISPR/CAS protein factor. In some embodiments, the interaction between the adenosine deaminase and the double-stranded substrate is further mediated by one or more nucleic acid component(s), including a guide RNA.

In certain example embodiments, directed evolution may be used to design modified ADAR proteins capable of catalyzing additional reactions besides deamination of an adenine to a hypoxanthine.

Modified Adenosine Deaminase Having C to U Deamination Activity

In certain example embodiments, directed evolution may be used to design modified ADAR proteins capable of catalyzing additional reactions besides deamination of an adenine to a hypoxanthine. For example, the modified ADAR protein may be capable of catalyzing deamination of a cytidine to a uracil. While not bound by a particular theory, mutations that improve C to U activity may alter the shape of the binding pocket to be more amenable to the smaller cytidine base. In some cases, the modified ADAR comprises mutations on residues in the catalytic core and/or residues that contact the RNA target. Examples of mutations on residues in the catalytic core include V351G and K350I, based on amino acid sequence positions of hADAR2-D, and mutations in a homologous ADAR protein corresponding to the above. Examples of mutations on residues that contact the RNA target include S486A and S495N, based on amino acid sequence positions of hADAR2-D, and mutations in a homologous ADAR protein corresponding to the above.

In certain embodiments, the adenosine deaminase is engineered to convert the activity to cytidine deaminase. Such engineered adenosine deaminase may also retain its adenosine deaminase activity, i.e., such mutated adenosine deaminase may have both adenosine deaminase and cytidine deaminase activities. Accordingly, in some embodiments, the adenosine deaminase comprises one or more mutations in positions selected from E396, C451, V351, R455, T375, K376, S486, Q488, R510, K594, R348, G593, S397, H443, L444, Y445, F442, E438, T448, A353, V355, T339, P539, V525, I520, P462 and N579. In particular embodiments, the adenosine deaminase comprises one or more mutations in a position selected from V351, L444, V355, V525 and I520. In some embodiments, the adenosine deaminase may comprise one or more mutations at E488, V351, S486, T375, S370, P462, N597, based on amino acid sequence positions of hADAR2-D, and mutations in a homologous ADAR protein corresponding to the above.

In some embodiments, the adenosine deaminase may comprise one or more ofthe mutations: E488Q based on amino acid sequence positions of hADAR2-D,and mutations in a homologous ADAR protein corresponding to the above.In some embodiments, the adenosine deaminase may comprise one or more ofthe mutations: E488Q, V351G, based on amino acid sequence positions ofhADAR2-D, and mutations in a homologous ADAR protein corresponding tothe above. In some embodiments, the adenosine deaminase may comprise oneor more of the mutations: E488Q, V351G, S486A, based on amino acidsequence positions of hADAR2-D, and mutations in a homologous ADARprotein corresponding to the above. In some embodiments, the adenosinedeaminase may comprise one or more of the mutations: E488Q, V351G,S486A, T375S, based on amino acid sequence positions of hADAR2-D, andmutations in a homologous ADAR protein corresponding to the above. Insome embodiments, the adenosine deaminase may comprise one or more ofthe mutations: E488Q, V351G, S486A, T375S, S370C, based on amino acidsequence positions of hADAR2-D, and mutations in a homologous ADARprotein corresponding to the above. In some embodiments, the adenosinedeaminase may comprise one or more of the mutations: E488Q, V351G,S486A, T375S, S370C, P462A, based on amino acid sequence positions ofhADAR2-D, and mutations in a homologous ADAR protein corresponding tothe above. In some embodiments, the adenosine deaminase may comprise oneor more of the mutations: E488Q, V351G, S486A, T375S, S370C, P462A,N597I, based on amino acid sequence positions of hADAR2-D, and mutationsin a homologous ADAR protein corresponding to the above. In someembodiments, the adenosine deaminase may comprise one or more of themutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, basedon amino acid sequence positions of hADAR2-D, and mutations in ahomologous ADAR protein corresponding to the above. 
In some embodiments,the adenosine deaminase may comprise one or more of the mutations:E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, based onamino acid sequence positions of hADAR2-D, and mutations in a homologousADAR protein corresponding to the above. In some embodiments, theadenosine deaminase may comprise one or more of the mutations: E488Q,V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, based onamino acid sequence positions of hADAR2-D, and mutations in a homologousADAR protein corresponding to the above. In some embodiments, theadenosine deaminase may comprise one or more of the mutations: E488Q,V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L,based on amino acid sequence positions of hADAR2-D, and mutations in ahomologous ADAR protein corresponding to the above. In some embodiments,the adenosine deaminase may comprise one or more of the mutations:E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I,M383L, D619G, based on amino acid sequence positions of hADAR2-D, andmutations in a homologous ADAR protein corresponding to the above. Insome embodiments, the adenosine deaminase may comprise one or more ofthe mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I,I398V, K350I, M383L, D619G, S582T, based on amino acid sequencepositions of hADAR2-D, and mutations in a homologous ADAR proteincorresponding to the above. In some embodiments, the adenosine deaminasemay comprise one or more of the mutations: E488Q, V351G, S486A, T375S,S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440Ibased on amino acid sequence positions of hADAR2-D, and mutations in ahomologous ADAR protein corresponding to the above. 
In some embodiments,the adenosine deaminase may comprise one or more of the mutations:E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I,M383L, D619G, S582T, V440I, S495N based on amino acid sequence positionsof hADAR2-D, and mutations in a homologous ADAR protein corresponding tothe above. In some embodiments, the adenosine deaminase may comprise oneor more of the mutations: E488Q, V351G, S486A, T375S, S370C, P462A,N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418Ebased on amino acid sequence positions of hADAR2-D, and mutations in ahomologous ADAR protein corresponding to the above. In some embodiments,the adenosine deaminase may comprise one or more of the mutations:E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I,M383L, D619G, S582T, V440I, S495N, K418E, S661T based on amino acidsequence positions of hADAR2-D, and mutations in a homologous ADARprotein corresponding to the above. In some examples, provided hereinincludes a mutated adenosine deaminase e.g., an adenosine deaminasecomprising one or more mutations of E488Q, V351G, S486A, T375S, S370C,P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N,K418E, S661T (based on amino acid sequence positions of hADAR2-D, andmutations in a homologous ADAR protein corresponding to the above),fused with a dead CRISPR-Cas protein or CRISPR-Cas nickase. In aparticular example, provided herein includes a mutated adenosinedeaminase e.g., an adenosine deaminase comprising E488Q, V351G, S486A,T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T,V440I, S495N, K418E, and S661T (based on amino acid sequence positionsof hADAR2-D, and mutations in a homologous ADAR protein corresponding tothe above), fused with a dead CRISPR-Cas protein or a CRISPR-Casnickase.

In some embodiments, the modified adenosine deaminase having C-to-Udeamination activity comprises a mutation at any one or more ofpositions V351, T375, R455, and E488 of the hADAR2-D amino acidsequence, or a corresponding position in a homologous ADAR protein. Insome embodiments, the adenosine deaminase comprises mutation E488Q. Insome embodiments, the adenosine deaminase comprises one or more ofmutations selected from V351I, V351L, V351F, V351M, V351C, V351A, V351G,V351P, V351T, V351S, V351Y, V351W, V351Q, V351N, V351H, V351E, V351D,V351K, V351R, T375I, T375L, T375V, T375F, T375M, T375C, T375A, T375G,T375P, T375S, T375Y, T375W, T375Q, T375N, T375H, T375E, T375D, T375K,T375R, R455I, R455L, R455V, R455F, R455M, R455C, R455A, R455G, R455P,R455T, R455S, R455Y, R455W, R455Q, R455N, R455H, R455E, R455D, R455K. Insome embodiments, the adenosine deaminase comprises mutation E488Q, andfurther comprises one or more of mutations selected from V351I, V351L,V351F, V351M, V351C, V351A, V351G, V351P, V351T, V351S, V351Y, V351W,V351Q, V351N, V351H, V351E, V351D, V351K, V351R, T375I, T375L, T375V,T375F, T375M, T375C, T375A, T375G, T375P, T375S, T375Y, T375W, T375Q,T375N, T375H, T375E, T375D, T375K, T375R, R455I, R455L, R455V, R455F,R455M, R455C, R455A, R455G, R455P, R455T, R455S, R455Y, R455W, R455Q,R455N, R455H, R455E, R455D, R455K.
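The substitution lists above cover every one of the 19 alternative amino acids at positions V351, T375 and R455. As an illustrative sketch only (the function and variable names are hypothetical and not part of the disclosure), such a saturation list can be enumerated programmatically using the conventional notation of wild-type residue, position, then replacement residue:

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Wild-type residues at the hADAR2-D positions named in the text.
WILD_TYPE = {351: "V", 375: "T", 455: "R"}


def saturation_mutations(wild_type):
    """Return every single substitution (e.g. 'V351G') at each position."""
    mutations = []
    for position, residue in sorted(wild_type.items()):
        for substitute in AMINO_ACIDS:
            if substitute != residue:  # skip the wild-type residue itself
                mutations.append(f"{residue}{position}{substitute}")
    return mutations


candidates = saturation_mutations(WILD_TYPE)
```

Each of the three positions contributes 19 substitutions, so the enumeration yields 57 candidate single mutations, any of which may be combined with E488Q as described above.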

In some cases, the modified ADAR may further comprise one or more mutations that reduce off-target activities. In cases where the modified ADAR has C-to-U deamination activity, such mutations may reduce A-to-I off-target activity and increase C-to-U on-target deamination activity. In general, such mutations may be on residues that interact with the RNA target. Examples of such mutations include S375N, S375C, S375A, and N473I, based on amino acid sequence positions of hADAR2-D, and mutations in a homologous ADAR protein corresponding to the above. In one example, the ADAR has an S375N mutation. In one example, provided herein is a mutated adenosine deaminase, e.g., an adenosine deaminase comprising E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, S661T, and S375N (based on amino acid sequence positions of hADAR2-D, and mutations in a homologous ADAR protein corresponding to the above), fused with a dead CRISPR-Cas protein or a CRISPR-Cas nickase.

In connection with the aforementioned modified ADAR protein having C-to-U deamination activity, the invention described herein also relates to a method for deaminating a C in a target RNA sequence of interest, comprising delivering to a target RNA or DNA an AD-functionalized composition disclosed herein.

In certain example embodiments, the method for deaminating a C in a target RNA sequence comprises delivering to said target RNA: (a) a catalytically inactive (dead) Cas; (b) a guide molecule which comprises a guide sequence linked to a direct repeat sequence; and (c) a modified ADAR protein having C-to-U deamination activity or catalytic domain thereof; wherein said modified ADAR protein or catalytic domain thereof is covalently or non-covalently linked to said dead Cas protein or said guide molecule or is adapted to link thereto after delivery; wherein said guide molecule forms a complex with said dead Cas protein and directs said complex to bind said target RNA sequence of interest; wherein said guide sequence is capable of hybridizing with a target sequence comprising said C to form an RNA duplex; wherein, optionally, said guide sequence comprises a non-pairing A or U at a position corresponding to said C resulting in a mismatch in the RNA duplex formed; and wherein said modified ADAR protein or catalytic domain thereof deaminates said C in said RNA duplex.

In connection with the aforementioned modified ADAR protein having C-to-U deamination activity, the invention described herein further relates to an engineered, non-naturally occurring system suitable for deaminating a C in a target locus of interest, comprising: (a) a guide molecule which comprises a guide sequence linked to a direct repeat sequence, or a nucleotide sequence encoding said guide molecule; (b) a catalytically inactive CRISPR-Cas protein, or a nucleotide sequence encoding said catalytically inactive CRISPR-Cas protein; (c) a modified ADAR protein having C-to-U deamination activity or catalytic domain thereof, or a nucleotide sequence encoding said modified ADAR protein or catalytic domain thereof; wherein said modified ADAR protein or catalytic domain thereof is covalently or non-covalently linked to said CRISPR-Cas protein or said guide molecule or is adapted to link thereto after delivery; wherein said guide sequence is capable of hybridizing with a target RNA sequence comprising a C to form an RNA duplex; wherein, optionally, said guide sequence comprises a non-pairing A or U at a position corresponding to said C resulting in a mismatch in the RNA duplex formed; wherein, optionally, the system is a vector system comprising one or more vectors comprising: (a) a first regulatory element operably linked to a nucleotide sequence encoding said guide molecule which comprises said guide sequence, (b) a second regulatory element operably linked to a nucleotide sequence encoding said catalytically inactive CRISPR-Cas protein; and (c) a nucleotide sequence encoding a modified ADAR protein having C-to-U deamination activity or catalytic domain thereof which is under control of said first or second regulatory element or operably linked to a third regulatory element; wherein, if said nucleotide sequence encoding a modified ADAR protein or catalytic domain thereof is operably linked to a third regulatory element, said modified ADAR protein or catalytic domain thereof is adapted to link to said guide molecule or said CRISPR-Cas protein after expression; wherein components (a), (b) and (c) are located on the same or different vectors of the system, optionally wherein said first, second, and/or third regulatory element is an inducible promoter.

In an embodiment of the invention, the substrate of the adenosine deaminase is an RNA/DNA heteroduplex formed upon binding of the guide molecule to its DNA target, which then forms the CRISPR-Cas complex with the CRISPR-Cas enzyme. The RNA/DNA or DNA/RNA heteroduplex is also referred to herein as the “RNA/DNA hybrid”, “DNA/RNA hybrid” or “double-stranded substrate”.

According to the present invention, the substrate of the adenosine deaminase is an RNA/DNA heteroduplex formed upon binding of the guide molecule to its DNA target, which then forms the CRISPR-Cas complex with the CRISPR-Cas enzyme. The substrate of the adenosine deaminase can also be an RNA/RNA duplex formed upon binding of the guide molecule to its RNA target, which then forms the CRISPR-Cas complex with the CRISPR-Cas enzyme. The RNA/DNA or DNA/RNA heteroduplex is also referred to herein as the “RNA/DNA hybrid”, “DNA/RNA hybrid” or “double-stranded substrate”. The particular features of the guide molecule and CRISPR-Cas enzyme are detailed below.

The term “editing selectivity” as used herein refers to the fraction of all sites on a double-stranded substrate that are edited by an adenosine deaminase. Without being bound by theory, it is contemplated that editing selectivity of an adenosine deaminase is affected by the double-stranded substrate's length and secondary structures, such as the presence of mismatched bases, bulges and/or internal loops.

In some embodiments, when the substrate is a perfectly base-paired duplex longer than 50 bp, the adenosine deaminase may be able to deaminate multiple adenosine residues within the duplex (e.g., 50% of all adenosine residues). In some embodiments, when the substrate is shorter than 50 bp, the editing selectivity of an adenosine deaminase is affected by the presence of a mismatch at the target adenosine site. Particularly, in some embodiments, an adenosine (A) residue having a mismatched cytidine (C) residue on the opposite strand is deaminated with high efficiency. In some embodiments, an adenosine (A) residue having a mismatched guanosine (G) residue on the opposite strand is skipped without editing.

In particular embodiments, the adenosine deaminase protein or catalytic domain thereof is delivered to the cell or expressed within the cell as a separate protein, but is modified so as to be able to link to either the Cas protein or the guide molecule. In particular embodiments, this is ensured by the use of orthogonal RNA-binding protein or adaptor protein/aptamer combinations that exist within the diversity of bacteriophage coat proteins. Examples of such coat proteins include but are not limited to: MS2, Qβ, F2, GA, fr, JP501, M12, R17, BZ13, JP34, JP500, KU1, M11, MX1, TW18, VK, SP, FI, ID2, NL95, TW19, AP205, ϕCb5, ϕCb8r, ϕCb12r, ϕCb23r, 7s and PRR1. Aptamers can be naturally occurring or synthetic oligonucleotides that have been engineered through repeated rounds of in vitro selection or SELEX (systematic evolution of ligands by exponential enrichment) to bind to a specific target.

In particular embodiments, the guide molecule is provided with one or more distinct RNA loop(s) or distinct sequence(s) that can recruit an adaptor protein. A guide molecule may be extended, without colliding with the Cas protein, by the insertion of distinct RNA loop(s) or distinct sequence(s) that may recruit adaptor proteins that can bind to the distinct RNA loop(s) or distinct sequence(s). Examples of modified guides and their use in recruiting effector domains to the Cas complex are provided in Konermann (Nature 2015, 517(7536): 583-588). In particular embodiments, the aptamer is a minimal hairpin aptamer which selectively binds dimerized MS2 bacteriophage coat proteins in mammalian cells and is introduced into the guide molecule, such as in the stem-loop and/or in a tetraloop. In these embodiments, the adenosine deaminase protein is fused to MS2. The adenosine deaminase protein is then co-delivered together with the Cas protein and corresponding guide RNA.

In some embodiments, the Cas-ADAR base editing system described herein comprises (a) a Cas protein, which is catalytically inactive or a nickase; (b) a guide molecule which comprises a guide sequence; and (c) an adenosine deaminase protein or catalytic domain thereof; wherein the adenosine deaminase protein or catalytic domain thereof is covalently or non-covalently linked to the Cas protein or the guide molecule or is adapted to link thereto after delivery; wherein the guide sequence is substantially complementary to the target sequence but comprises a non-pairing C corresponding to the A being targeted for deamination, resulting in an A-C mismatch in a DNA-RNA or RNA-RNA duplex formed by the guide sequence and the target sequence. For application in eukaryotic cells, the Cas protein and/or the adenosine deaminase are preferably NLS-tagged.

In some embodiments, the components (a), (b) and (c) are delivered to the cell as a ribonucleoprotein complex. The ribonucleoprotein complex can be delivered via one or more lipid nanoparticles.

In some embodiments, the components (a), (b) and (c) are delivered to the cell as one or more RNA molecules, such as one or more guide RNAs and one or more mRNA molecules encoding the Cas protein, the adenosine deaminase protein, and optionally the adaptor protein. The RNA molecules can be delivered via one or more lipid nanoparticles.

In some embodiments, the components (a), (b) and (c) are delivered to the cell as one or more DNA molecules. In some embodiments, the one or more DNA molecules are comprised within one or more vectors such as viral vectors (e.g., AAV). In some embodiments, the one or more DNA molecules comprise one or more regulatory elements operably configured to express the Cas protein, the guide molecule, and the adenosine deaminase protein or catalytic domain thereof, optionally wherein the one or more regulatory elements comprise inducible promoters.

In some embodiments, the guide molecule is capable of hybridizing with a target sequence comprising the Adenine to be deaminated within a first DNA strand or an RNA strand at the target locus to form a DNA-RNA or RNA-RNA duplex which comprises a non-pairing Cytosine opposite to said Adenine. Upon duplex formation, the guide molecule forms a complex with the Cas protein and directs the complex to bind said first DNA strand or said RNA strand at the target locus of interest. Details on the aspect of the guide of the Cas-ADAR base editing system are provided herein below.

In some embodiments, a Cas guide RNA having a canonical length (e.g., about 20 nt for AacCas) is used to form a DNA-RNA or RNA-RNA duplex with the target DNA or RNA. In some embodiments, a Cas guide molecule longer than the canonical length (e.g., >20 nt for AacCas) is used to form a DNA-RNA or RNA-RNA duplex with the target DNA or RNA, including outside of the Cas-guide RNA-target DNA complex. In certain example embodiments, the guide sequence has a length of about 29-53 nt capable of forming a DNA-RNA or RNA-RNA duplex with said target sequence. In certain other example embodiments, the guide sequence has a length of about 40-50 nt capable of forming a DNA-RNA or RNA-RNA duplex with said target sequence. In certain example embodiments, the distance between said non-pairing C and the 5′ end of said guide sequence is 20-30 nucleotides. In certain example embodiments, the distance between said non-pairing C and the 3′ end of said guide sequence is 20-30 nucleotides.
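The guide geometry described above (a guide substantially complementary to the target, with a non-pairing C placed opposite the target A at a chosen distance from the guide's 5′ end) can be illustrated with a minimal sketch for an RNA target. The function name, the parameter names, and the default values of 50 nt and offset 25 are assumptions chosen to fall within the stated ranges; they are not values prescribed by the disclosure.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def design_guide(target_rna, a_index, guide_len=50, c_offset=25):
    """Return a guide (5'->3') complementary to target_rna except for a C
    placed opposite the adenosine at target_rna[a_index].

    c_offset (hypothetical parameter) is the 0-based index of the
    mismatched C counted from the guide's 5' end.
    """
    if target_rna[a_index] != "A":
        raise ValueError("a_index must point at an adenosine")
    end = a_index + c_offset + 1   # one past the 3'-most target base covered
    start = end - guide_len        # 5'-most target base covered
    if start < 0 or end > len(target_rna):
        raise ValueError("guide window runs off the target sequence")
    window = target_rna[start:end]
    # Reverse complement: guide base i pairs with window base guide_len-1-i.
    guide = [COMPLEMENT[base] for base in reversed(window)]
    guide[c_offset] = "C"          # non-pairing C opposite the target A
    return "".join(guide)


# Toy target: a single adenosine flanked by 30 uridines on each side.
target = "U" * 30 + "A" + "U" * 30
guide = design_guide(target, a_index=30)
```

With the toy target, the resulting 50-nt guide is all A except for the single C at offset 25, which sits opposite the target adenosine and creates the A-C mismatch.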

In at least a first design, the Cas-ADAR system comprises (a) an adenosine deaminase fused or linked to a Cas protein, wherein the Cas protein is catalytically inactive or a nickase, and (b) a guide molecule comprising a guide sequence designed to introduce an A-C mismatch in a DNA-RNA or RNA-RNA duplex formed between the guide sequence and the target sequence. In some embodiments, the Cas protein and/or the adenosine deaminase are NLS-tagged, on either the N- or C-terminus or both.

In at least a second design, the Cas-ADAR system comprises (a) a Cas protein that is catalytically inactive or a nickase, (b) a guide molecule comprising a guide sequence designed to introduce an A-C mismatch in a DNA-RNA or RNA-RNA duplex formed between the guide sequence and the target sequence, and an aptamer sequence (e.g., MS2 RNA motif or PP7 RNA motif) capable of binding to an adaptor protein (e.g., MS2 coat protein or PP7 coat protein), and (c) an adenosine deaminase fused or linked to an adaptor protein, wherein the binding of the aptamer and the adaptor protein recruits the adenosine deaminase to the DNA-RNA or RNA-RNA duplex formed between the guide sequence and the target sequence for targeted deamination at the A of the A-C mismatch. In some embodiments, the adaptor protein and/or the adenosine deaminase are NLS-tagged, on either the N- or C-terminus or both. The Cas protein can also be NLS-tagged.

The use of different aptamers and corresponding adaptor proteins also allows orthogonal gene editing to be implemented. In one example in which an adenosine deaminase is used in combination with a cytidine deaminase for orthogonal gene editing/deamination, sgRNAs targeting different loci are modified with distinct RNA loops in order to recruit MS2-adenosine deaminase and PP7-cytidine deaminase (or PP7-adenosine deaminase and MS2-cytidine deaminase), respectively, resulting in orthogonal deamination of A or C at the target loci of interest, respectively. PP7 is the RNA-binding coat protein of a Pseudomonas bacteriophage. Like MS2, it binds a specific RNA sequence and secondary structure. The PP7 RNA-recognition motif is distinct from that of MS2. Consequently, PP7 and MS2 can be multiplexed to mediate distinct effects at different genomic loci simultaneously. For example, an sgRNA targeting locus A can be modified with MS2 loops, recruiting MS2-adenosine deaminase, while another sgRNA targeting locus B can be modified with PP7 loops, recruiting PP7-cytidine deaminase. In the same cell, orthogonal, locus-specific modifications are thus realized. This principle can be extended to incorporate other orthogonal RNA-binding proteins.
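The locus A / locus B example above amounts to a lookup from the aptamer carried by each guide to the adaptor-deaminase fusion it recruits. The following sketch only illustrates that bookkeeping; the dictionary layout and all names are hypothetical.

```python
# Hypothetical lookup: aptamer on the guide -> adaptor-deaminase fusion recruited.
RECRUITMENT = {
    "MS2": "MS2-adenosine deaminase",
    "PP7": "PP7-cytidine deaminase",
}

# Hypothetical guides: each targets one locus and carries one aptamer type.
GUIDES = [
    {"locus": "A", "aptamer": "MS2"},
    {"locus": "B", "aptamer": "PP7"},
]


def edits_by_locus(guides, recruitment):
    """Map each targeted locus to the deaminase fusion recruited there."""
    return {g["locus"]: recruitment[g["aptamer"]] for g in guides}


plan = edits_by_locus(GUIDES, RECRUITMENT)
```

Because each aptamer is recognized only by its own coat protein, swapping the two entries in the lookup gives the alternative pairing mentioned in the text (PP7-adenosine deaminase and MS2-cytidine deaminase).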

In at least a third design, the Cas-ADAR CRISPR system comprises (a) an adenosine deaminase inserted into an internal loop or unstructured region of a Cas protein, wherein the Cas protein is catalytically inactive or a nickase, and (b) a guide molecule comprising a guide sequence designed to introduce an A-C mismatch in a DNA-RNA or RNA-RNA duplex formed between the guide sequence and the target sequence.

Cas protein split sites that are suitable for insertion of adenosine deaminase can be identified with the help of a crystal structure. For example, with respect to AacCas mutants, it should be readily apparent what the corresponding position is from, for example, a sequence alignment. For other Cas proteins, one can use the crystal structure of an ortholog if a relatively high degree of homology exists between the ortholog and the intended Cas protein.

The split position may be located within a region or loop. Preferably, the split position occurs where an interruption of the amino acid sequence does not result in the partial or full destruction of a structural feature (e.g., alpha-helices or β-sheets). Unstructured regions (regions that did not show up in the crystal structure because these regions are not structured enough to be “frozen” in a crystal) are often preferred options. Splits in all unstructured regions that are exposed on the surface of Cas are envisioned in the practice of the invention. The positions within the unstructured regions or outside loops may not need to be exactly the numbers provided above, but may vary by, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, or even 10 amino acids either side of the position given above, depending on the size of the loop, so long as the split position still falls within an unstructured region or outside loop.

The Cas-ADAR system described herein can be used to target a specific Adenine within a DNA sequence for deamination. For example, the guide molecule can form a complex with the Cas protein and direct the complex to bind a target sequence at the target locus of interest. Because the guide sequence is designed to have a non-pairing C, the heteroduplex formed between the guide sequence and the target sequence comprises an A-C mismatch, which directs the adenosine deaminase to contact and deaminate the A opposite to the non-pairing C, converting it to an Inosine (I). Since Inosine (I) base pairs with C and functions like G in cellular processes, the targeted deamination of A described herein is useful for correction of undesirable G-A and C-T mutations, as well as for obtaining desirable A-G and T-C mutations. In some embodiments, the guide may comprise one or more mismatches to increase specificity. For example, the guide may comprise one or more disfavorable guanine mismatches across from off-target adenosines.
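The reasoning above, that deaminating A yields inosine, which is read like G, can be illustrated with a toy sketch. The sequences and function names are hypothetical, and inosine is written as the letter "I" purely for illustration.

```python
def deaminate(seq, index):
    """Replace the adenosine at the given index with inosine (written 'I')."""
    if seq[index] != "A":
        raise ValueError("only an adenosine can be deaminated to inosine")
    return seq[:index] + "I" + seq[index + 1:]


def read_as(seq):
    """Return the sequence as interpreted in the cell: inosine is read like G."""
    return seq.replace("I", "G")


# Hypothetical example: a G-to-A mutation at index 3 of a short sequence.
reference = "CCGGGC"
mutant = "CCGAGC"

edited = deaminate(mutant, 3)   # the A becomes I
restored = read_as(edited)      # I is read as G
```

In this toy case the edited sequence reads the same as the reference, which is the sense in which a G-A mutation is corrected. A C-T mutation on one strand of DNA appears as a G-A change on the complementary strand, so it can be addressed by targeting the A on that complementary strand.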

Base Excision Repair Inhibitor

In some embodiments, the AD-functionalized CRISPR system further comprises a base excision repair (BER) inhibitor. Without wishing to be bound by any particular theory, cellular DNA-repair response to the presence of I:T pairing may be responsible for a decrease in nucleobase editing efficiency in cells. Alkyladenine DNA glycosylase (also known as DNA-3-methyladenine glycosylase, 3-alkyladenine DNA glycosylase, or N-methylpurine DNA glycosylase) catalyzes removal of hypoxanthine from DNA in cells, which may initiate base excision repair, with reversion of the I:T pair to an A:T pair as the outcome.

In some embodiments, the BER inhibitor is an inhibitor of alkyladenine DNA glycosylase. In some embodiments, the BER inhibitor is an inhibitor of human alkyladenine DNA glycosylase. In some embodiments, the BER inhibitor is a polypeptide inhibitor. In some embodiments, the BER inhibitor is a protein that binds hypoxanthine. In some embodiments, the BER inhibitor is a protein that binds hypoxanthine in DNA. In some embodiments, the BER inhibitor is a catalytically inactive alkyladenine DNA glycosylase protein or binding domain thereof. In some embodiments, the BER inhibitor is a catalytically inactive alkyladenine DNA glycosylase protein or binding domain thereof that does not excise hypoxanthine from the DNA. Other proteins that are capable of inhibiting (e.g., sterically blocking) an alkyladenine DNA glycosylase base-excision repair enzyme are within the scope of this disclosure. Additionally, any proteins that block or inhibit base-excision repair are also within the scope of this disclosure.

Without wishing to be bound by any particular theory, base excision repair may be inhibited by molecules that bind the edited strand, block the edited base, inhibit alkyladenine DNA glycosylase, inhibit base excision repair, protect the edited base, and/or promote fixing of the non-edited strand. It is believed that the use of the BER inhibitor described herein can increase the editing efficiency of an adenosine deaminase that is capable of catalyzing an A to I change.

Accordingly, in the first design of the AD-functionalized CRISPR system discussed above, the CRISPR-Cas protein or the adenosine deaminase can be fused to or linked to a BER inhibitor (e.g., an inhibitor of alkyladenine DNA glycosylase). In some embodiments, the BER inhibitor can be comprised in one of the following structures (nCas=Cas nickase; dCas=dead Cas): [AD]-[optional linker]-[nCas/dCas]-[optional linker]-[BER inhibitor]; [AD]-[optional linker]-[BER inhibitor]-[optional linker]-[nCas/dCas]; [BER inhibitor]-[optional linker]-[AD]-[optional linker]-[nCas/dCas]; [BER inhibitor]-[optional linker]-[nCas/dCas]-[optional linker]-[AD]; [nCas/dCas]-[optional linker]-[AD]-[optional linker]-[BER inhibitor]; [nCas/dCas]-[optional linker]-[BER inhibitor]-[optional linker]-[AD].
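The six structures listed for the first design are the six possible orderings of the three domains (AD, nCas/dCas, BER inhibitor), each joined by an optional linker. A minimal sketch, with hypothetical function and variable names, that enumerates them:

```python
from itertools import permutations

# The three domains of the first-design fusion, as named in the text.
DOMAINS = ("AD", "nCas/dCas", "BER inhibitor")


def fusion_architectures(domains=DOMAINS):
    """Return every N-to-C ordering of the domains, joined by optional linkers."""
    return [
        "-[optional linker]-".join(f"[{domain}]" for domain in order)
        for order in permutations(domains)
    ]


architectures = fusion_architectures()
```

Three domains give 3! = 6 orderings, matching the six structures enumerated above; the order in which the sketch emits them differs from the order in the text.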

Similarly, in the second design of the AD-functionalized CRISPR system discussed above, the CRISPR-Cas protein, the adenosine deaminase, or the adaptor protein can be fused to or linked to a BER inhibitor (e.g., an inhibitor of alkyladenine DNA glycosylase). In some embodiments, the BER inhibitor can be comprised in one of the following structures (nCas=Cas nickase; dCas=dead Cas): [nCas/dCas]-[optional linker]-[BER inhibitor]; [BER inhibitor]-[optional linker]-[nCas/dCas]; [AD]-[optional linker]-[Adaptor]-[optional linker]-[BER inhibitor]; [AD]-[optional linker]-[BER inhibitor]-[optional linker]-[Adaptor]; [BER inhibitor]-[optional linker]-[AD]-[optional linker]-[Adaptor]; [BER inhibitor]-[optional linker]-[Adaptor]-[optional linker]-[AD]; [Adaptor]-[optional linker]-[AD]-[optional linker]-[BER inhibitor]; [Adaptor]-[optional linker]-[BER inhibitor]-[optional linker]-[AD].

In the third design of the AD-functionalized CRISPR system discussed above, the BER inhibitor can be inserted into an internal loop or unstructured region of a CRISPR-Cas protein.

Cytidine Deaminase

In some embodiments, the deaminase is a cytidine deaminase. The term “cytidine deaminase” or “cytidine deaminase protein” or “cytidine deaminase activity” as used herein refers to a protein, a polypeptide, or one or more functional domain(s) of a protein or a polypeptide that is capable of catalyzing a hydrolytic deamination reaction that converts a cytosine (or a cytosine moiety of a molecule) to a uracil (or a uracil moiety of a molecule), as shown below. In some embodiments, the cytosine-containing molecule is a cytidine (C), and the uracil-containing molecule is a uridine (U). The cytosine-containing molecule can be deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). In certain examples, a cytidine deaminase may be a cytidine deaminase acting on RNA (CDAR).

According to the present disclosure, cytidine deaminases that can be used in connection with the present disclosure include, but are not limited to, members of the enzyme family known as apolipoprotein B mRNA-editing complex (APOBEC) family deaminase, an activation-induced deaminase (AID), or a cytidine deaminase 1 (CDA1). In particular embodiments, the deaminase is an APOBEC1 deaminase, an APOBEC2 deaminase, an APOBEC3A deaminase, an APOBEC3B deaminase, an APOBEC3C deaminase, an APOBEC3D deaminase, an APOBEC3E deaminase, an APOBEC3F deaminase, an APOBEC3G deaminase, an APOBEC3H deaminase, or an APOBEC4 deaminase.

In the methods and systems of the present invention, the cytidine deaminase or engineered adenosine deaminase with cytidine deaminase activity is capable of targeting cytosine in a DNA single strand. In certain example embodiments, the cytidine deaminase activity may edit on a single strand present outside of the binding component, e.g., bound CRISPR-Cas. In other example embodiments, the cytidine deaminase may edit at a localized bubble, such as a localized bubble formed by a mismatch between the guide sequence and the target at the edit site. In certain example embodiments, the cytidine deaminase may contain mutations that help focus the area of activity, such as those disclosed in Kim et al., Nature Biotechnology (2017) 35(4):371-377 (doi:10.1038/nbt.3803).

In some embodiments, the cytidine deaminase is derived from one or more metazoan species, including but not limited to, mammals, birds, frogs, squids, fish, flies and worms. In some embodiments, the cytidine deaminase is a human, primate, cow, dog, rat or mouse cytidine deaminase.

In some embodiments, the cytidine deaminase is a human APOBEC, including hAPOBEC1 or hAPOBEC3. In some embodiments, the cytidine deaminase is a human AID.

In some embodiments, the cytidine deaminase protein recognizes and converts one or more target cytosine residue(s) in a single-stranded bubble of an RNA duplex into uracil residue(s). In some embodiments, the cytidine deaminase protein recognizes a binding window on the single-stranded bubble of an RNA duplex. In some embodiments, the binding window contains at least one target cytosine residue. In some embodiments, the binding window is in the range of about 3 bp to about 100 bp. In some embodiments, the binding window is in the range of about 5 bp to about 50 bp. In some embodiments, the binding window is in the range of about 10 bp to about 30 bp. In some embodiments, the binding window is about 1 bp, 2 bp, 3 bp, 5 bp, 7 bp, 10 bp, 15 bp, 20 bp, 25 bp, 30 bp, 40 bp, 45 bp, 50 bp, 55 bp, 60 bp, 65 bp, 70 bp, 75 bp, 80 bp, 85 bp, 90 bp, 95 bp, or 100 bp.

In some embodiments, the cytidine deaminase protein comprises one or more deaminase domains. Not intended to be bound by theory, it is contemplated that the deaminase domain functions to recognize and convert one or more target cytosine (C) residue(s) contained in a single-stranded bubble of an RNA duplex into uracil (U) residue(s). In some embodiments, the deaminase domain comprises an active center. In some embodiments, the active center comprises a zinc ion. In some embodiments, amino acid residues in or near the active center interact with one or more nucleotide(s) 5′ to a target cytosine residue. In some embodiments, amino acid residues in or near the active center interact with one or more nucleotide(s) 3′ to a target cytosine residue.

In some embodiments, the cytidine deaminase comprises human APOBEC1 full protein (hAPOBEC1) or the deaminase domain thereof (hAPOBEC1-D) or a C-terminally truncated version thereof (hAPOBEC-T). In some embodiments, the cytidine deaminase is an APOBEC family member that is homologous to hAPOBEC1, hAPOBEC-D or hAPOBEC-T.

In some embodiments, the cytidine deaminase comprises human AID1 full protein (hAID) or the deaminase domain thereof (hAID-D) or a C-terminally truncated version thereof (hAID-T). In some embodiments, the cytidine deaminase is an AID family member that is homologous to hAID, hAID-D or hAID-T. In some embodiments, the hAID-T is a hAID which is C-terminally truncated by about 20 amino acids.

In some embodiments, the cytidine deaminase comprises the wild-type amino acid sequence of a cytosine deaminase. In some embodiments, the cytidine deaminase comprises one or more mutations in the cytosine deaminase sequence, such that the editing efficiency, and/or substrate editing preference of the cytosine deaminase is changed according to specific needs.

Certain mutations of APOBEC1 and APOBEC3 proteins have been described in Kim et al., Nature Biotechnology (2017) 35(4):371-377 (doi:10.1038/nbt.3803); and Harris et al. Mol. Cell (2002) 10:1247-1253, each of which is incorporated herein by reference in its entirety.

In some embodiments, the cytidine deaminase is an APOBEC1 deaminase comprising one or more mutations at amino acid positions corresponding to W90, R118, H121, H122, R126, or R132 in rat APOBEC1, or an APOBEC3G deaminase comprising one or more mutations at amino acid positions corresponding to W285, R313, D316, D317, R320, or R326 in human APOBEC3G.

In some embodiments, the cytidine deaminase comprises a mutation at tryptophan 90 of the rat APOBEC1 amino acid sequence, or a corresponding position in a homologous APOBEC protein, such as tryptophan 285 of APOBEC3G. In some embodiments, the tryptophan residue at position 90 is replaced by a tyrosine or phenylalanine residue (W90Y or W90F).

In some embodiments, the cytidine deaminase comprises a mutation at arginine 118 of the rat APOBEC1 amino acid sequence, or a corresponding position in a homologous APOBEC protein. In some embodiments, the arginine residue at position 118 is replaced by an alanine residue (R118A).

In some embodiments, the cytidine deaminase comprises a mutation at histidine 121 of the rat APOBEC1 amino acid sequence, or a corresponding position in a homologous APOBEC protein. In some embodiments, the histidine residue at position 121 is replaced by an arginine residue (H121R).

In some embodiments, the cytidine deaminase comprises a mutation at histidine 122 of the rat APOBEC1 amino acid sequence, or a corresponding position in a homologous APOBEC protein. In some embodiments, the histidine residue at position 122 is replaced by an arginine residue (H122R).

In some embodiments, the cytidine deaminase comprises a mutation at arginine 126 of the rat APOBEC1 amino acid sequence, or a corresponding position in a homologous APOBEC protein, such as arginine 320 of APOBEC3G. In some embodiments, the arginine residue at position 126 is replaced by an alanine residue (R126A) or by a glutamic acid residue (R126E).

In some embodiments, the cytidine deaminase comprises a mutation at arginine 132 of the rat APOBEC1 amino acid sequence, or a corresponding position in a homologous APOBEC protein. In some embodiments, the arginine residue at position 132 is replaced by a glutamic acid residue (R132E).

In some embodiments, to narrow the width of the editing window, the cytidine deaminase may comprise one or more of the mutations: W90Y, W90F, R126E and R132E, based on amino acid sequence positions of rat APOBEC1, and mutations in a homologous APOBEC protein corresponding to the above.

In some embodiments, to reduce editing efficiency, the cytidine deaminase may comprise one or more of the mutations: W90A, R118A, R132E, based on amino acid sequence positions of rat APOBEC1, and mutations in a homologous APOBEC protein corresponding to the above. In particular embodiments, it can be of interest to use a cytidine deaminase enzyme with reduced efficacy to reduce off-target effects.

In some embodiments, the cytidine deaminase is wild-type rat APOBEC1 (rAPOBEC1) or a catalytic domain thereof. In some embodiments, the cytidine deaminase comprises one or more mutations in the rAPOBEC1 sequence, such that the editing efficiency, and/or substrate editing preference of rAPOBEC1 is changed according to specific needs.

rAPOBEC1: (SEQ ID NO: 243)MSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEINWGGRHSIWRHTSQNTNKHVEVNFIEKFTTERYFCPNTRCSITWFLSWSPCGECSRAITEFLSRYPHVTLFIYIARLYHHADPRNRQGLRDLISSGVTIQIMTEQESGYCWRNFVNYSPSNEAHWPRYPHLWVRLYVLELYCIILGLPPCLNILRRKQPQLTFFTIALQSCHYQRLPPHILWATGLK

In some embodiments, the cytidine deaminase is wild-type human APOBEC1 (hAPOBEC1) or a catalytic domain thereof. In some embodiments, the cytidine deaminase comprises one or more mutations in the hAPOBEC1 sequence, such that the editing efficiency, and/or substrate editing preference of hAPOBEC1 is changed according to specific needs.

APOBEC1: (SEQ ID NO: 244)MTSEKGPSTGDPTLRRRIEPWEFDVFYDPRELRKEACLLYEIKWGMSRKIWRSSGKNTTNHVEVNFIKKFTSERDFHPSMSCSITWFLSWSPCWECSQAIREFLSRHPGVTLVIYVARLFWHMDQQNRQGLRDLVNSGVTIQIMRASEYYHCWRNFVNYPPGDEAHWPQYPPLWMMLYALELHCIILSLPPCLKISRRWQNHLTFFRLHLQNCHYQTIPPHILLATGLIHPSVAWR

In some embodiments, the cytidine deaminase is wild-type human APOBEC3G (hAPOBEC3G) or a catalytic domain thereof. In some embodiments, the cytidine deaminase comprises one or more mutations in the hAPOBEC3G sequence, such that the editing efficiency, and/or substrate editing preference of hAPOBEC3G is changed according to specific needs.

hAPOBEC3G: (SEQ ID NO: 245)MELKYHPEMRFFHWFSKWRKLHRDQEYEVTWYISWSPCTKCTRDMATFLAEDPKVTLTIFVARLYYFWDPDYQEALRSLCQKRDGPRATMKIMNYDEFQHCWSKFVYSQRELEEPWNNLPKYYILLHIMLGEILRHSMDPPTFTENENNEPWVRGRHETYLCYEVERMHNDTWVLLNQRRGELCNQAPHKHGELEGRHAELCFLDVIPFWKLDLDQDYRVTCFTSWSPCFSCAQEMAKFISKNKHVSLCIFTARIYDDQGRCQEGLRTLAEAGAKISIMTYSEFKHCWDTFVDHQGCPFQPWDGLDEHSQDLSGRLRAILQNQEN

In some embodiments, the cytidine deaminase is wild-type Petromyzon marinus CDA1 (pmCDA1) or a catalytic domain thereof. In some embodiments, the cytidine deaminase comprises one or more mutations in the pmCDA1 sequence, such that the editing efficiency, and/or substrate editing preference of pmCDA1 is changed according to specific needs.

pmCDA1: (SEQ ID NO: 246)MTDAEYVRIHEKLDIYTFKKQFFNNKKSVSHRCYVLFELKRRGERRACFWGYAVNKPQSGTERGIHAEIFSIRKVEEYLRDNPGQFTINWYSSWSPCADCAEKILEWYNQELRGNGHTLKIWACKLYYEKNARNQIGLWNLRDNGVGLNVMVSEHYQCCRKIFIQSSHNQLNENRWLEKTLKRAEKRRSELSIMIQVKIL HTTKSPAV

In some embodiments, the cytidine deaminase is wild-type human AID (hAID) or a catalytic domain thereof. In some embodiments, the cytidine deaminase comprises one or more mutations in the hAID sequence, such that the editing efficiency, and/or substrate editing preference of hAID is changed according to specific needs.

hAID: (SEQ ID NO: 247)MDSLLMNRRKFLYQFKNVRWAKGRRETYLCYVVKRRDSATSFSLDFGYLRNKNGCHVELLFLRYISDWDLDPGRCYRVTWFTSWSPCYDCARHVADFLRGNPYLSLRIFTARLYFCEDRKAEPEGLRRLHRAGVQIAIMTFKDYFYCWNTFVENHERTFKAWEGLHENSVRLSRQLRRILLPLYEVDDLRDAFRTLGLLD

In some embodiments, the cytidine deaminase is a truncated version of hAID (hAID-DC) or a catalytic domain thereof. In some embodiments, the cytidine deaminase comprises one or more mutations in the hAID-DC sequence, such that the editing efficiency, and/or substrate editing preference of hAID-DC is changed according to specific needs.

hAID-DC: (SEQ ID NO: 248)MDSLLMNRRKFLYQFKNVRWAKGRRETYLCYVVKRRDSATSFSLDFGYLRNKNGCHVELLFLRYISDWDLDPGRCYRVTWFTSWSPCYDCARHVADFLRGNPNLSLRIFTARLYFCEDRKAEPEGLRRLHRAGVQIAIMTFKDYFYCWNTFVENHERTFKAWEGLHENSVRLSRQLRRILL

Additional embodiments of the cytidine deaminase are disclosed in WO2017/070632, titled “Nucleobase Editor and Uses Thereof,” which is incorporated herein by reference in its entirety.

In some embodiments, the cytidine deaminase has an efficient deamination window that encloses the nucleotides susceptible to deamination editing. Accordingly, in some embodiments, the “editing window width” refers to the number of nucleotide positions at a given target site for which editing efficiency of the cytidine deaminase exceeds the half-maximal value for that target site. In some embodiments, the cytidine deaminase has an editing window width in the range of about 1 to about 6 nucleotides. In some embodiments, the editing window width of the cytidine deaminase is 1, 2, 3, 4, 5, or 6 nucleotides.
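By way of illustration only, the editing window width defined above can be computed from per-position editing efficiencies as sketched below. The function name and the efficiency values are hypothetical and are not data from the present disclosure; the sketch assumes one efficiency value per nucleotide position at a single target site.

```python
def editing_window_width(efficiencies):
    """Count positions whose editing efficiency exceeds half of the
    maximal efficiency observed at the same target site."""
    if not efficiencies:
        return 0
    half_max = max(efficiencies) / 2.0
    return sum(1 for value in efficiencies if value > half_max)


# Hypothetical per-position editing efficiencies (fractions) at one site.
example_site = [0.02, 0.10, 0.35, 0.60, 0.55, 0.40, 0.12, 0.03]
width = editing_window_width(example_site)
print(width)
```

With these illustrative values the half-maximal efficiency is 0.30, so four positions fall within the window.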

Not intended to be bound by theory, it is contemplated that in some embodiments, the length of the linker sequence affects the editing window width. In some embodiments, the editing window width increases (e.g., from about 3 to about 6 nucleotides) as the linker length extends (e.g., from about 3 to about 21 amino acids). In a non-limiting example, a 16-residue linker offers an efficient deamination window of about 5 nucleotides. In some embodiments, the length of the guide RNA affects the editing window width. In some embodiments, shortening the guide RNA leads to a narrowed efficient deamination window of the cytidine deaminase.

In some embodiments, mutations to the cytidine deaminase affect the editing window width. In some embodiments, the cytidine deaminase component of the CD-functionalized CRISPR system comprises one or more mutations that reduce the catalytic efficiency of the cytidine deaminase, such that the deaminase is prevented from deamination of multiple cytidines per DNA binding event. In some embodiments, tryptophan at residue 90 (W90) of APOBEC1 or a corresponding tryptophan residue in a homologous sequence is mutated. In some embodiments, the catalytically inactive CRISPR-Cas is fused to or linked to an APOBEC1 mutant that comprises a W90Y or W90F mutation. In some embodiments, tryptophan at residue 285 (W285) of APOBEC3G, or a corresponding tryptophan residue in a homologous sequence is mutated. In some embodiments, the catalytically inactive CRISPR-Cas is fused to or linked to an APOBEC3G mutant that comprises a W285Y or W285F mutation.

In some embodiments, the cytidine deaminase component of the CD-functionalized CRISPR system comprises one or more mutations that reduce tolerance for non-optimal presentation of a cytidine to the deaminase active site. In some embodiments, the cytidine deaminase comprises one or more mutations that alter substrate binding activity of the deaminase active site. In some embodiments, the cytidine deaminase comprises one or more mutations that alter the conformation of DNA to be recognized and bound by the deaminase active site. In some embodiments, the cytidine deaminase comprises one or more mutations that alter the substrate accessibility to the deaminase active site. In some embodiments, arginine at residue 126 (R126) of APOBEC1 or a corresponding arginine residue in a homologous sequence is mutated. In some embodiments, the catalytically inactive CRISPR-Cas is fused to or linked to an APOBEC1 that comprises a R126A or R126E mutation. In some embodiments, arginine at residue 320 (R320) of APOBEC3G, or a corresponding arginine residue in a homologous sequence is mutated. In some embodiments, the catalytically inactive CRISPR-Cas is fused to or linked to an APOBEC3G mutant that comprises a R320A or R320E mutation. In some embodiments, arginine at residue 132 (R132) of APOBEC1 or a corresponding arginine residue in a homologous sequence is mutated. In some embodiments, the catalytically inactive CRISPR-Cas is fused to or linked to an APOBEC1 mutant that comprises a R132E mutation.

In some embodiments, the APOBEC1 domain of the CD-functionalized CRISPR system comprises one, two, or three mutations selected from W90Y, W90F, R126A, R126E, and R132E. In some embodiments, the APOBEC1 domain comprises double mutations of W90Y and R126E. In some embodiments, the APOBEC1 domain comprises double mutations of W90Y and R132E. In some embodiments, the APOBEC1 domain comprises double mutations of R126E and R132E. In some embodiments, the APOBEC1 domain comprises three mutations of W90Y, R126E and R132E.
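As a non-limiting sketch, the double and triple mutants described above can be generated in silico by applying point mutations to the rat APOBEC1 sequence. The helper name `apply_mutations` is hypothetical. Only the first 132 residues of the rAPOBEC1 sequence (SEQ ID NO: 243) are reproduced here, which is sufficient to cover positions 90 through 132; the check of the wild-type residue guards against numbering errors.

```python
import re

# First 132 residues of rAPOBEC1 (SEQ ID NO: 243), written in blocks of ten.
RAPOBEC1_N132 = (
    "MSSETGPVAV" "DPTLRRRIEP" "HEFEVFFDPR" "ELRKETCLLY" "EINWGGRHSI"
    "WRHTSQNTNK" "HVEVNFIEKF" "TTERYFCPNT" "RCSITWFLSW" "SPCGECSRAI"
    "TEFLSRYPHV" "TLFIYIARLY" "HHADPRNRQG" "LR"
)


def apply_mutations(sequence, mutations):
    """Apply point mutations written as <wild type><1-based position><replacement>,
    e.g. 'W90Y', after verifying the wild-type residue at each position."""
    residues = list(sequence)
    for name in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", name)
        if match is None:
            raise ValueError(f"unrecognized mutation name: {name}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range: {name}")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{name}: found {residues[position - 1]} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


# Triple mutant W90Y + R126E + R132E described in the text.
triple_mutant = apply_mutations(RAPOBEC1_N132, ["W90Y", "R126E", "R132E"])
```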

In some embodiments, one or more mutations in the cytidine deaminase as disclosed herein reduce the editing window width to about 2 nucleotides. In some embodiments, one or more mutations in the cytidine deaminase as disclosed herein reduce the editing window width to about 1 nucleotide. In some embodiments, one or more mutations in the cytidine deaminase as disclosed herein reduce the editing window width while only minimally or modestly affecting the editing efficiency of the enzyme. In some embodiments, one or more mutations in the cytidine deaminase as disclosed herein reduce the editing window width without reducing the editing efficiency of the enzyme. In some embodiments, one or more mutations in the cytidine deaminase as disclosed herein enable discrimination of neighboring cytidine nucleotides, which would be otherwise edited with similar efficiency by the cytidine deaminase.

In some embodiments, the cytidine deaminase protein further comprises or is connected to one or more double-stranded RNA (dsRNA) binding motifs (dsRBMs) or domains (dsRBDs) for recognizing and binding to double-stranded nucleic acid substrates. In some embodiments, the interaction between the cytidine deaminase and the substrate is mediated by one or more additional protein factor(s), including a CRISPR/CAS protein factor. In some embodiments, the interaction between the cytidine deaminase and the substrate is further mediated by one or more nucleic acid component(s), including a guide RNA.

According to the present invention, the substrate of the cytidine deaminase is a DNA single-strand bubble of an RNA duplex comprising a cytosine of interest, made accessible to the cytidine deaminase upon binding of the guide molecule to its DNA target, which then forms the CRISPR-Cas complex with the CRISPR-Cas enzyme, whereby the cytosine deaminase is fused to or is capable of binding to one or more components of the CRISPR-Cas complex, i.e., the CRISPR-Cas enzyme and/or the guide molecule. The particular features of the guide molecule and CRISPR-Cas enzyme are detailed below.

The cytidine deaminase or catalytic domain thereof may be a human, a rat, or a lamprey cytidine deaminase protein or catalytic domain thereof.

The cytidine deaminase protein or catalytic domain thereof may be an apolipoprotein B mRNA-editing complex (APOBEC) family deaminase. The cytidine deaminase protein or catalytic domain thereof may be an activation-induced deaminase (AID). The cytidine deaminase protein or catalytic domain thereof may be a cytidine deaminase 1 (CDA1).

The cytidine deaminase protein or catalytic domain thereof may be an APOBEC1 deaminase. The APOBEC1 deaminase may comprise one or more mutations corresponding to W90A, W90Y, R118A, H121R, H122R, R126A, R126E, or R132E in rat APOBEC1. Alternatively, the cytidine deaminase protein or catalytic domain thereof may be an APOBEC3G deaminase comprising one or more mutations corresponding to W285A, W285Y, R313A, D316R, D317R, R320A, R320E, or R326E in human APOBEC3G.
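The correspondence between the rat APOBEC1 mutations and the human APOBEC3G mutations listed above can be kept as a simple lookup table, sketched below. The pairing follows the order in which the two lists are recited and is an assumption of this sketch; the table and function names are hypothetical. An explicit table is used rather than a fixed numerical offset because the difference between corresponding positions is not constant across the listed pairs (195 for W90/W285 but 194 for R126/R320).

```python
# Rat APOBEC1 mutation -> corresponding human APOBEC3G mutation, paired in
# the order the two lists are recited in the text (assumed correspondence).
RAT_APOBEC1_TO_HUMAN_APOBEC3G = {
    "W90A": "W285A",
    "W90Y": "W285Y",
    "R118A": "R313A",
    "H121R": "D316R",
    "H122R": "D317R",
    "R126A": "R320A",
    "R126E": "R320E",
    "R132E": "R326E",
}


def to_apobec3g(rat_mutations):
    """Translate a list of rat APOBEC1 mutation names to APOBEC3G names."""
    return [RAT_APOBEC1_TO_HUMAN_APOBEC3G[name] for name in rat_mutations]


print(to_apobec3g(["W90Y", "R126E"]))
```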

The system may further comprise a uracil glycosylase inhibitor (UGI). In some embodiments, the cytidine deaminase protein or catalytic domain thereof is delivered together with a uracil glycosylase inhibitor (UGI). The UGI may be linked (e.g., covalently linked) to the cytidine deaminase protein or catalytic domain thereof and/or a catalytically inactive CRISPR-Cas protein.

Regulation of Post-Translational Modification of Gene Products

In some cases, base editing may be used for regulating post-translational modification of gene products. In some cases, an amino acid residue that is a post-translational modification site may be mutated by base editing to an amino acid residue that cannot be modified. Examples of such post-translational modifications include disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, methylation, ubiquitination, sumoylation, or any combinations thereof.

In some embodiments, the base editors herein may regulate the Stat3/IRF-5 pathway, e.g., for reduction of inflammation. For example, phosphorylation on Tyr705 of Stat3 and on Thr10, Ser158, Ser309, Ser317, Ser451, and/or Ser462 of IRF-5 may be involved with interleukin signaling. Base editors herein may be used to mutate one or more of these phosphorylation sites for regulating immunity, autoimmunity, and/or inflammation.

In some embodiments, the base editors herein may regulate the insulin receptor substrate (IRS) pathway. For example, phosphorylation on Ser265, Ser302, Ser325, Ser336, Ser358, Ser407, and/or Ser408 may be involved in regulating (e.g., inhibiting) the IRS pathway. Alternatively or additionally, Serine 307 in mouse (or Serine 312 in human) may be mutated so the phosphorylation may be regulated. For example, Serine 307 phosphorylation may lead to degradation of IRS-1 and reduce MAPK signaling. Serine 307 phosphorylation may be induced under insulin insensitivity conditions, such as insulin overstimulation and/or TNFα treatment. In some examples, an S307F mutation may be generated for stabilizing the interaction between IRS-1 and other components in the pathway. Base editors herein may be used to mutate one or more of these phosphorylation sites for regulating the IRS pathway.
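As a hedged illustration of how a cytidine deaminase could convert a phosphorylatable serine into a non-phosphorylatable residue, such as the serine 307 to phenylalanine change mentioned above, the sketch below lists the amino acid outcomes of every single C-to-U edit within an mRNA codon. It uses the standard genetic code; the function name is hypothetical, and the sketch assumes that the edited cytidine lies on the coding transcript itself.

```python
from itertools import product

# Standard genetic code, codons ordered with U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_c_to_u_outcomes(codon):
    """Return {edited codon: encoded amino acid} for each single C-to-U edit."""
    outcomes = {}
    for index, base in enumerate(codon):
        if base == "C":
            edited = codon[:index] + "U" + codon[index + 1:]
            outcomes[edited] = CODON_TABLE[edited]
    return outcomes


# Serine codons that a single C-to-U edit can turn into phenylalanine (F).
serine_codons = [codon for codon, aa in CODON_TABLE.items() if aa == "S"]
ser_to_phe = sorted(
    codon for codon in serine_codons
    if "F" in single_c_to_u_outcomes(codon).values()
)
print(ser_to_phe)
```

Under these assumptions, only the UCU and UCC serine codons yield phenylalanine from a single edit; UCA and UCG yield leucine, and AGU and AGC cannot change the encoded residue by C-to-U editing.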

Regulation of Stability of Gene Products

In some embodiments, base editing may be used for regulating the stability of gene products. For example, one or more amino acid residues that regulate protein degradation rates may be mutated by the base editors herein. In some cases, such amino acid residues may be in a degron. A degron may refer to a portion of a protein involved in regulating the degradation rate of the protein. Degrons may include short amino acid sequences, structural motifs, and exposed amino acids (e.g., lysine or arginine). Some proteins may comprise multiple degrons. The degrons may be ubiquitin-dependent (e.g., regulating protein degradation based on ubiquitination of the protein) or ubiquitin-independent.

In some cases, base editing may be used to mutate one or more amino acid residues in a signal peptide for protein degradation. In some examples, the signal peptide may be a PEST sequence, which is a peptide sequence that is rich in proline (P), glutamic acid (E), serine (S), and threonine (T). For example, the stability of NANOG, which comprises a PEST sequence, may be increased, e.g., to promote embryonic stem cell pluripotency.

In some examples, the base editors may be used for mutating SMN2 (e.g., to generate an S270A mutation) to increase stability of the SMN2 protein, which is involved in spinal muscular atrophy. Other mutations in SMN2 that may be generated by base editors include those described in Cho S. et al., Genes Dev. 2010 Mar. 1; 24(5): 438-442. In certain examples, the base editors may be used for generating mutations on IκBα, as described in Fortmann K T et al., J Mol Biol. 2015 Aug. 28; 427(17): 2748-2756. Target sites in degrons may be identified by computational tools, e.g., the online tools provided on slim.ucd.ie/apc/index.php. Other targets include Cdc25A phosphatase.

Examples of Genes that can be Targeted by Base Editors

In some examples, the base editors may be used for modifying PCSK9. The base editors may introduce stop codons and/or disease-associated mutations that reduce PCSK9 activity. The base editing may introduce one or more of the following mutations in PCSK9: R46L, R46A, A53V, A53A, E57K, Y142X, L253F, R237W, H391N, N425S, A443T, I474V, I474A, Q554E, Q619P, E670G, E670A, C679X, H417Q, R469W, E482G, F515L, and/or H553R.

In some examples, the base editors may be used for modifying ApoE. The base editors may target ApoE in synthetic model and/or patient-derived neurons (e.g., those derived from iPSC). The targeting may be tested by sequencing.

In some examples, the base editors may be used for modifying Stat1/3. The base editor may target Y705 and/or S727 for reducing Stat1/3 activation. The base editing may be tested by a luciferase-based promoter assay. Targeting Stat1/3 by base editing may block monocyte to macrophage differentiation, and inflammation in response to ox-LDL stimulation of macrophages.

In some examples, the base editors may be used for modifying TFEB (transcription factor EB). The base editor may target one or more amino acid residues that regulate translocation of the TFEB. In some cases, the base editor may target one or more amino acid residues that regulate autophagy.

In some examples, the base editors may be used for modifying ornithine carbamoyl transferase (OTC). Such modification may be used to correct ornithine carbamoyl transferase deficiency. For example, base editing may correct the Leu45Pro mutation by converting nucleotide 134C to U. An example approach is shown in FIG. 102.

In some examples, the base editors may be used for modifying Lipin1. The base editor may target one or more serines that can be phosphorylated by mTOR. Base editing of Lipin1 may regulate lipid accumulation. The base editors may target Lipin1 in a 3T3L1 preadipocyte model. Effects of the base editing may be tested by measuring reduction of lipid accumulation (e.g., via oil red).
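The arithmetic behind the OTC example can be checked with a short sketch. It assumes that nucleotide numbering starts at 1 with the first base of the start codon, so that nucleotide 134 is the second base of codon 45; it then confirms that a C-to-U change at the second base of any proline codon (CCN) yields a leucine codon (CUN). The function names are hypothetical.

```python
def locate_in_codon(nucleotide_number):
    """Map a 1-based coding-sequence nucleotide number to
    (1-based codon number, 1-based position within that codon)."""
    if nucleotide_number < 1:
        raise ValueError("nucleotide numbering is 1-based")
    index = nucleotide_number - 1
    return index // 3 + 1, index % 3 + 1


PROLINE_CODONS = {"CCU", "CCC", "CCA", "CCG"}
LEUCINE_CUN_CODONS = {"CUU", "CUC", "CUA", "CUG"}


def edit_second_base_c_to_u(codon):
    """Convert the second base of a codon from C to U."""
    if codon[1] != "C":
        raise ValueError("second base is not C")
    return codon[0] + "U" + codon[2]


codon_number, position = locate_in_codon(134)
print(codon_number, position)
print(all(edit_second_base_c_to_u(c) in LEUCINE_CUN_CODONS for c in PROLINE_CODONS))
```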

Base Editing Guide Molecule Design Considerations

In some embodiments, the guide sequence is an RNA sequence of between 10 and 50 nt in length, but more particularly of about 20-30 nt, advantageously about 20 nt, 23-25 nt or 24 nt. In base editing embodiments, the guide sequence is selected so as to ensure that it hybridizes to the target sequence comprising the adenosine to be deaminated. This is described in more detail below. Selection can encompass further steps which increase efficacy and specificity of deamination.

In some embodiments, the guide sequence is about 20 nt to about 30 nt long and hybridizes to the target DNA strand to form an almost perfectly matched duplex, except for having a dA-C mismatch at the target adenosine site. Particularly, in some embodiments, the dA-C mismatch is located close to the center of the target sequence (and thus the center of the duplex upon hybridization of the guide sequence to the target sequence), thereby restricting the adenosine deaminase to a narrow editing window (e.g., about 4 bp wide). In some embodiments, the target sequence may comprise more than one target adenosine to be deaminated. In further embodiments the target sequence may further comprise one or more dA-C mismatch 3′ to the target adenosine site. In some embodiments, to avoid off-target editing at an unintended adenine site in the target sequence, the guide sequence can be designed to comprise a non-pairing guanine at a position corresponding to said unintended adenine to introduce a dA-G mismatch, which is catalytically unfavorable for certain adenosine deaminases such as ADAR1 and ADAR2. See Wong et al., RNA 7:846-858 (2001), which is incorporated herein by reference in its entirety.

In some embodiments, a CRISPR-Cas guide sequence having a canonical length (e.g., about 20 nt) is used to form a heteroduplex with the target DNA. In some embodiments, a CRISPR-Cas guide molecule longer than the canonical length (e.g., >20 nt) is used to form a heteroduplex with the target DNA including outside of the CRISPR-Cas-guide RNA-target DNA complex. This can be of interest where deamination of more than one adenine within a given stretch of nucleotides is of interest. In alternative embodiments, it is of interest to maintain the limitation of the canonical guide sequence length. In some embodiments, the guide sequence is designed to introduce a dA-C mismatch outside of the canonical length of CRISPR-Cas guide, which may decrease steric hindrance by CRISPR-Cas and increase the frequency of contact between the adenosine deaminase and the dA-C mismatch.
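A minimal sketch of the mismatch design described above follows. It builds an RNA guide complementary to a DNA target strand, places a cytidine opposite the target adenosine (dA-C mismatch), and places a guanosine opposite any adenosine that should not be edited (dA-G mismatch). The function name, the 0-based indexing, and the 10-nt target are illustrative assumptions; the target is shorter than the about 20 nt to about 30 nt guides described above so that the result can be checked by eye.

```python
# RNA base that pairs with each DNA base of the target strand.
RNA_PARTNER = {"A": "U", "C": "G", "G": "C", "T": "A"}


def design_guide(target_dna, target_index, avoid_indices=()):
    """Return the guide RNA (5'->3') for a DNA target strand (5'->3').

    target_index: 0-based position of the adenosine to be deaminated.
    avoid_indices: 0-based positions of adenosines to protect from editing.
    """
    if target_dna[target_index] != "A":
        raise ValueError("target_index does not point at an adenosine")
    partners = []
    for index, base in enumerate(target_dna):
        if index == target_index:
            partners.append("C")  # dA-C mismatch at the intended site
        elif index in avoid_indices:
            if base != "A":
                raise ValueError("avoid_indices must point at adenosines")
            partners.append("G")  # dA-G mismatch at an unintended site
        else:
            partners.append(RNA_PARTNER[base])
    # The guide is antiparallel to the target, so reverse to read 5'->3'.
    return "".join(reversed(partners))


guide = design_guide("GGCTACGATC", target_index=4, avoid_indices={7})
print(guide)
```

For this toy target, a fully complementary guide would read GAUCGUAGCC; the designed guide differs from it at exactly the two mismatch positions.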

In some base editing embodiments, the position of the mismatched nucleobase (e.g., cytidine) is calculated from where the PAM would be on a DNA target. In some embodiments, the mismatched nucleobase is positioned 12-21 nt from the PAM, or 13-21 nt from the PAM, or 14-21 nt from the PAM, or 14-20 nt from the PAM, or 15-20 nt from the PAM, or 16-20 nt from the PAM, or 14-19 nt from the PAM, or 15-19 nt from the PAM, or 16-19 nt from the PAM, or 17-19 nt from the PAM, or about 20 nt from the PAM, or about 19 nt from the PAM, or about 18 nt from the PAM, or about 17 nt from the PAM, or about 16 nt from the PAM, or about 15 nt from the PAM, or about 14 nt from the PAM. In a preferred embodiment, the mismatched nucleobase is positioned 17-19 nt or 18 nt from the PAM.

Mismatch distance is the number of bases between the 3′ end of the CRISPR-Cas spacer and the mismatched nucleobase (e.g., cytidine), wherein the mismatched base is included as part of the mismatch distance calculation. In some embodiments, the mismatch distance is 1-10 nt, or 1-9 nt, or 1-8 nt, or 2-8 nt, or 2-7 nt, or 2-6 nt, or 3-8 nt, or 3-7 nt, or 3-6 nt, or 3-5 nt, or about 2 nt, or about 3 nt, or about 4 nt, or about 5 nt, or about 6 nt, or about 7 nt, or about 8 nt. In a preferred embodiment, the mismatch distance is 3-5 nt or 4 nt.

In some embodiments, the editing window of a CRISPR-Cas-ADAR system described herein is 12-21 nt from the PAM, or 13-21 nt from the PAM, or 14-21 nt from the PAM, or 14-20 nt from the PAM, or 15-20 nt from the PAM, or 16-20 nt from the PAM, or 14-19 nt from the PAM, or 15-19 nt from the PAM, or 16-19 nt from the PAM, or 17-19 nt from the PAM, or about 20 nt from the PAM, or about 19 nt from the PAM, or about 18 nt from the PAM, or about 17 nt from the PAM, or about 16 nt from the PAM, or about 15 nt from the PAM, or about 14 nt from the PAM. In some embodiments, the editing window of the CRISPR-Cas-ADAR system described herein is 1-10 nt from the 3′ end of the CRISPR-Cas spacer, or 1-9 nt from the 3′ end of the CRISPR-Cas spacer, or 1-8 nt from the 3′ end of the CRISPR-Cas spacer, or 2-8 nt from the 3′ end of the CRISPR-Cas spacer, or 2-7 nt from the 3′ end of the CRISPR-Cas spacer, or 2-6 nt from the 3′ end of the CRISPR-Cas spacer, or 3-8 nt from the 3′ end of the CRISPR-Cas spacer, or 3-7 nt from the 3′ end of the CRISPR-Cas spacer, or 3-6 nt from the 3′ end of the CRISPR-Cas spacer, or 3-5 nt from the 3′ end of the CRISPR-Cas spacer, or about 2 nt from the 3′ end of the CRISPR-Cas spacer, or about 3 nt from the 3′ end of the CRISPR-Cas spacer, or about 4 nt from the 3′ end of the CRISPR-Cas spacer, or about 5 nt from the 3′ end of the CRISPR-Cas spacer, or about 6 nt from the 3′ end of the CRISPR-Cas spacer, or about 7 nt from the 3′ end of the CRISPR-Cas spacer, or about 8 nt from the 3′ end of the CRISPR-Cas spacer.
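One possible reading of the mismatch distance definition above is sketched below: the distance counts the mismatched base and every spacer base 3′ of it, so a mismatch at the 3′-terminal base has a distance of 1. This inclusive counting, the 1-based position counted from the 5′ end of the spacer, and the function names are assumptions of the sketch.

```python
def mismatch_distance(spacer_length, mismatch_position):
    """Number of bases from the 3' end of the spacer up to and including
    the mismatched base.

    mismatch_position is 1-based and counted from the 5' end of the spacer.
    """
    if not 1 <= mismatch_position <= spacer_length:
        raise ValueError("mismatch_position lies outside the spacer")
    return spacer_length - mismatch_position + 1


def in_preferred_range(distance, low=3, high=5):
    """Check a distance against the preferred 3-5 nt range."""
    return low <= distance <= high


# Hypothetical 30-nt spacer with the mismatch at position 27 from the 5' end.
distance = mismatch_distance(30, 27)
print(distance, in_preferred_range(distance))
```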

Linkers

The deaminase herein may be fused to a Cas protein via a linker. It is further envisaged that RNA adenosine methylase (N(6)-methyladenosine) can be fused to the RNA targeting effector proteins of the invention and targeted to a transcript of interest. This methylase causes reversible methylation, has regulatory roles and may affect gene expression and cell fate decisions by modulating multiple RNA-related cellular pathways (Fu et al Nat Rev Genet. 2014; 15(5):293-306).

ADAR or other RNA modification enzymes may be linked (e.g., fused) to CRISPR-Cas or a dead CRISPR-Cas protein via a linker, e.g., to the C-terminus or the N-terminus of CRISPR-Cas or dead CRISPR-Cas.

The term “linker” as used in reference to a fusion protein refers to a molecule which joins the proteins to form a fusion protein. Generally, such molecules have no specific biological activity other than to join or to preserve some minimum distance or other spatial relationship between the proteins. However, in certain embodiments, the linker may be selected to influence some property of the linker and/or the fusion protein such as the folding, net charge, or hydrophobicity of the linker.

Suitable linkers for use in the methods of the present invention are well known to those of skill in the art and include, but are not limited to, straight or branched-chain carbon linkers, heterocyclic carbon linkers, or peptide linkers. However, as used herein the linker may also be a covalent bond (carbon-carbon bond or carbon-heteroatom bond). In particular embodiments, the linker is used to separate the CRISPR-Cas protein and the nucleotide deaminase by a distance sufficient to ensure that each protein retains its required functional property. Preferred peptide linker sequences adopt a flexible extended conformation and do not exhibit a propensity for developing an ordered secondary structure. In certain embodiments, the linker can be a chemical moiety which can be monomeric, dimeric, multimeric or polymeric. Preferably, the linker comprises amino acids. Typical amino acids in flexible linkers include Gly, Asn and Ser. Accordingly, in particular embodiments, the linker comprises a combination of one or more of Gly, Asn and Ser amino acids. Other near neutral amino acids, such as Thr and Ala, also may be used in the linker sequence. Exemplary linkers are disclosed in Maratea et al. (1985), Gene 40: 39-46; Murphy et al. (1986) Proc. Nat'l. Acad. Sci. USA 83: 8258-62; and U.S. Pat. Nos. 4,935,233 and 4,751,180. For example, GlySer linkers GGS, GGGS or GSG can be used. GGS, GSG, GGGS or GGGGS linkers can be used in repeats of 3 (such as (GGS)₃ (SEQ ID No. 249), (GGGGS)₃ (SEQ ID NO:79)) or 5, 6, 7, 9 or even 12 (SEQ ID NO:250-254) or more, to provide suitable lengths. In some cases, the linker may be (GGGGS)₃₋₁₅. For example, in some cases, the linker may be (GGGGS)₃₋₁₁, e.g., GGGGS (SEQ ID NO:255), (GGGGS)₂ (SEQ ID NO:256), (GGGGS)₃ (SEQ ID NO:79), (GGGGS)₄ (SEQ ID NO:257), (GGGGS)₅, (GGGGS)₆ (SEQ ID NO:251), (GGGGS)₇ (SEQ ID NO:252), (GGGGS)₈ (SEQ ID NO:258), (GGGGS)₉ (SEQ ID NO:253), (GGGGS)₁₀ (SEQ ID NO:259), or (GGGGS)₁₁ (SEQ ID NO:260).

In particular embodiments, linkers such as (GGGGS)₃ are preferably used herein. (GGGGS)₆, (GGGGS)₉ or (GGGGS)₁₂ may preferably be used as alternatives. Other preferred alternatives are (GGGGS)₁ (SEQ ID No 255), (GGGGS)₂ (SEQ ID No. 256), (GGGGS)₄, (GGGGS)₅, (GGGGS)₇, (GGGGS)₈, (GGGGS)₁₀, or (GGGGS)₁₁. In yet a further embodiment, LEPGEKPYKCPECGKSFSQSGALTRHQRTHTR (SEQ ID No:261) is used as a linker. In yet an additional embodiment, the linker is an XTEN linker. In particular embodiments, the CRISPR-Cas protein is linked to the deaminase protein or its catalytic domain by means of an LEPGEKPYKCPECGKSFSQSGALTRHQRTHTR (SEQ ID No:261) linker. In further particular embodiments, the CRISPR-Cas protein is linked C-terminally to the N-terminus of a deaminase protein or its catalytic domain by means of an LEPGEKPYKCPECGKSFSQSGALTRHQRTHTR (SEQ ID No:261) linker. In addition, N- and C-terminal NLSs can also function as a linker (e.g., PKKKRKVEASSPKKRKVEAS (SEQ ID No. 262)).
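By way of non-limiting illustration, the repeat-unit linkers described above can be enumerated programmatically. The following Python sketch is an illustrative assumption and not part of the disclosed sequence listing: the function names `repeat_linker` and `uses_only` are hypothetical, and the allowed residue set (Gly, Ser, Asn, Thr, Ala) simply mirrors the flexible and near neutral amino acids named above.

```python
def repeat_linker(unit: str, n: int) -> str:
    """Return the amino acid sequence of `unit` repeated n times, e.g. (GGGGS)3."""
    if n < 1:
        raise ValueError("repeat count must be at least 1")
    return unit * n


def uses_only(sequence: str, allowed: str = "GSNTA") -> bool:
    """True if every residue is one of the flexible or near neutral residues named in the text."""
    return all(residue in allowed for residue in sequence)


# Enumerate (GGGGS)1 through (GGGGS)12, the repeat counts mentioned above.
ggggs_linkers = {n: repeat_linker("GGGGS", n) for n in range(1, 13)}

if __name__ == "__main__":
    for n, seq in ggggs_linkers.items():
        print(f"(GGGGS){n}: {len(seq)} aa, flexible residues only: {uses_only(seq)}")
```

Such a helper may be convenient when a series of linker lengths is to be screened; the choice of repeat count for a given fusion remains an empirical matter.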

Examples of linkers are shown in Table 8 below.

TABLE 8
Linker    Nucleotide sequence
GGS    GGTGGTAGT (SEQ ID NO: 263)
GGSx3 (9)    GGTGGTAGTGGAGGGAGCGGCGGTTCA (SEQ ID NO: 264)
GGSx7 (21)    ggtggaggaggctctggtggaggcggtagcggaggcggagggtcgGGTGGTAGTGGAGGGAGCGGCGGTTCA (SEQ ID NO: 265)
XTEN    TCGGGATCTGAGACGCCTGGGACCTCGGAATCGGCTACGCCCGAAAGT (SEQ ID NO: 266)
Z-EGFR_Short    Gtggataacaaatttaacaaagaaatgtgggcggcgtgggaagaaattcgtaacctgccgaacctgaacggctggcagatgaccgcgtttattgcgagcctggtggatgatccgagccagagcgcgaacctgctggcggaagcgaaaaaactgaacgatgcgcaggcgccgaaaaccggcggtggttctggt (SEQ ID NO: 267)
GSAT    Ggtggttctgccggtggctccggttctggctccagcggtggcagctctggtgcgtccggcacgggtactgcgggtggcactggcagcggttccggtactggctctggc (SEQ ID NO: 268)
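As a non-limiting check, nucleotide sequences such as those of Table 8 can be translated to confirm the peptide each one encodes. The sketch below assumes the standard genetic code and translation from the first nucleotide; the helper name `translate` is hypothetical, and only three of the table entries are shown.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence (frame 1) into one-letter amino acids."""
    dna = dna.upper().replace(" ", "")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    print(translate("GGTGGTAGT"))                    # GGS entry
    print(translate("GGTGGTAGTGGAGGGAGCGGCGGTTCA"))  # GGSx3 entry
    print(translate("TCGGGATCTGAGACGCCTGGGACCTCGGAATCGGCTACGCCCGAAAGT"))  # XTEN entry
```

Translating a linker-encoding sequence in this way is one simple means of verifying that whitespace introduced during formatting has not shifted the reading frame.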

A nucleotide deaminase or other RNA modification enzyme may be linked to CRISPR-Cas or a dead CRISPR-Cas via one or more amino acids. In some cases, the nucleotide deaminase may be linked to the CRISPR-Cas or a dead CRISPR-Cas via one or more of amino acids 411-429, 114-124, 197-241, and 607-624. The amino acid position may correspond to a CRISPR-Cas ortholog disclosed herein. In certain examples, the nucleotide deaminase may be linked to the dead CRISPR-Cas via one or more amino acids corresponding to amino acids 411-429, 114-124, 197-241, and 607-624 of Prevotella buccae CRISPR-Cas.

Methods of Use in General

In another aspect, the present disclosure provides methods of using the compositions and systems herein. In general, the methods include modifying a target nucleic acid by introducing into a cell or organism that comprises the target nucleic acid the engineered CRISPR-Cas protein, polynucleotide(s) encoding the engineered CRISPR-Cas protein, the CRISPR-Cas system, or the vector or vector system comprising the polynucleotide(s), such that the engineered CRISPR-Cas protein modifies the target nucleic acid in the cell or organism. The engineered CRISPR-Cas protein or system may be introduced via delivery by liposomes, nanoparticles, exosomes, microvesicles, nucleic acid nanoassemblies, a gene gun, an implantable device, or the vector system herein. The cell or organism may be a eukaryotic cell or organism. The cell or organism may be an animal cell or organism. The cell or organism may be a plant cell or organism. Examples of nucleic acid nanoassemblies include DNA origami and RNA origami, e.g., those described in U.S. Pat. No. 8,554,489, US20160103951, WO2017189914, and WO2017189870, which are incorporated by reference in their entireties. A gene gun may include a biolistic particle delivery system, which is a device for delivering exogenous DNA (transgenes) to cells. The payload may be an elemental particle of a heavy metal coated with DNA (typically plasmid DNA). An example of delivery components in CRISPR-Cas systems is described in Svitashev et al., Nat Commun. 2016; 7: 13274.

In some embodiments, the target nucleic acid comprises a genomic locus, and the engineered CRISPR-Cas protein modifies a gene product encoded at the genomic locus or expression of the gene product. The target nucleic acid may be DNA or RNA, and one or more nucleotides in the target nucleic acid may be base edited. The target nucleic acid may be DNA or RNA, and the target nucleic acid may be cleaved. The engineered CRISPR-Cas protein may further cleave non-target nucleic acid.

In some embodiments, the methods may further comprise visualizing activity and, optionally, using a detectable label. The method may also comprise detecting binding of one or more components of the CRISPR-Cas system to the target nucleic acid.

In another aspect, the methods of use include detecting a target nucleic acid in a sample. In some embodiments, the methods include contacting a sample with: an engineered CRISPR-Cas protein herein; at least one guide polynucleotide comprising a guide sequence capable of binding to the target nucleic acid and designed to form a complex with the engineered CRISPR-Cas protein; and an RNA-based masking construct comprising a non-target sequence; wherein the engineered CRISPR-Cas protein exhibits collateral RNase activity and cleaves the non-target sequence of the masking construct; and detecting a signal from cleavage of the non-target sequence, thereby detecting the target nucleic acid in the sample. The methods may further comprise contacting the sample with reagents for amplifying the target nucleic acid. The reagents for amplifying may comprise isothermal amplification reaction reagents. The isothermal amplification reagents may comprise nucleic-acid sequence-based amplification, recombinase polymerase amplification, loop-mediated isothermal amplification, strand displacement amplification, helicase-dependent amplification, or nicking enzyme amplification reagents. The target nucleic acid may be a DNA molecule, and the method may further comprise contacting the target DNA molecule with a primer comprising an RNA polymerase site and an RNA polymerase.

The masking construct may suppress generation of a detectable positive signal until the masking construct is cleaved or deactivated, or may mask a detectable positive signal or generate a detectable negative signal until the masking construct is cleaved or deactivated. The masking construct may comprise: a silencing RNA that suppresses generation of a gene product encoded by a reporting construct, wherein the gene product generates the detectable positive signal when expressed; a ribozyme that generates the negative detectable signal, and wherein the positive detectable signal is generated when the ribozyme is deactivated; a ribozyme that converts a substrate to a first color and wherein the substrate converts to a second color when the ribozyme is deactivated; an aptamer and/or a polynucleotide-tethered inhibitor; a polynucleotide to which a detectable ligand and a masking component are attached; a nanoparticle held in aggregate by bridge molecules, wherein at least a portion of the bridge molecules comprises a polynucleotide, and wherein the solution undergoes a color shift when the nanoparticle is dispersed in solution; a quantum dot or fluorophore linked to one or more quencher molecules by a linking molecule, wherein at least a portion of the linking molecule comprises a polynucleotide; a polynucleotide in complex with an intercalating agent, wherein the intercalating agent changes absorbance upon cleavage of the polynucleotide; or two fluorophores tethered by a polynucleotide that undergo a shift in fluorescence when released from the polynucleotide.

The aptamer may comprise a polynucleotide-tethered inhibitor that sequesters an enzyme, wherein the enzyme generates a detectable signal upon release from the aptamer or polynucleotide-tethered inhibitor by acting upon a substrate; or may be an inhibitory aptamer that inhibits an enzyme and prevents the enzyme from catalyzing generation of a detectable signal from a substrate or wherein the polynucleotide-tethered inhibitor inhibits an enzyme and prevents the enzyme from catalyzing generation of a detectable signal from a substrate; or sequesters a pair of agents that when released from the aptamers combine to generate a detectable signal.

The nanoparticle may be a colloidal metal. The colloidal metal material may include water-insoluble metal particles or metallic compounds dispersed in a liquid, a hydrosol, or a metal sol. The colloidal metal may be selected from the metals in groups IA, IB, IIB and IIIB of the periodic table, as well as the transition metals, especially those of group VIII. Preferred metals include gold, silver, aluminum, ruthenium, zinc, iron, nickel and calcium. Other suitable metals also include the following in all of their various oxidation states: lithium, sodium, magnesium, potassium, scandium, titanium, vanadium, chromium, manganese, cobalt, copper, gallium, strontium, niobium, molybdenum, palladium, indium, tin, tungsten, rhenium, platinum, and gadolinium. The metals are preferably provided in ionic form, derived from an appropriate metal compound, for example the Al³⁺, Ru³⁺, Zn²⁺, Ni²⁺ and Ca²⁺ ions.

When the RNA bridge is cut by the activated CRISPR effector, the aforementioned color shift is observed. In certain example embodiments the particles are colloidal metals. In certain other example embodiments, the colloidal metal is a colloidal gold. In certain example embodiments, the colloidal nanoparticles are 15 nm gold nanoparticles (AuNPs). Due to the unique surface properties of colloidal gold nanoparticles, maximal absorbance is observed at 520 nm when they are fully dispersed in solution, and they appear red in color to the naked eye. Upon aggregation, AuNPs exhibit a red-shift in maximal absorbance and appear darker in color, eventually precipitating from solution as a dark purple aggregate.

In some embodiments, at least one guide polynucleotide comprises a mismatch. The mismatch may be up- or downstream of a single nucleotide variation on the one or more guide sequences. In certain embodiments, modulations of cleavage efficiency can be exploited by introduction of mismatches, e.g. 1 or more mismatches, such as 1 or 2 mismatches between spacer sequence and target sequence, including the position of the mismatch along the spacer/target. The more central (i.e. not at the 3′ or 5′ end) a mismatch, for instance a double mismatch, is, the more cleavage efficiency is affected. Accordingly, by choosing the mismatch position along the spacer, cleavage efficiency can be modulated. By means of example, if less than 100% cleavage of targets is desired (e.g. in a cell population), 1 or more, such as preferably 2 mismatches between spacer and target sequence may be introduced in the spacer sequences. The more central the mismatch position along the spacer, the lower the cleavage percentage. In certain example embodiments, the cleavage efficiency may be exploited to design single guides that can distinguish two or more targets that vary by a single nucleotide, such as a single nucleotide polymorphism (SNP), variation, or (point) mutation. The CRISPR effector may have reduced sensitivity to SNPs (or other single nucleotide variations) and continue to cleave SNP targets with a certain level of efficiency. Thus, for two targets, or a set of targets, a guide RNA may be designed with a nucleotide sequence that is complementary to one of the targets, i.e. the on-target SNP. The guide RNA is further designed to have a synthetic mismatch.
As used herein a “synthetic mismatch” refers to a non-naturally occurring mismatch that is introduced upstream or downstream of the naturally occurring SNP, such as at most 5 nucleotides upstream or downstream, for instance 4, 3, 2, or 1 nucleotide upstream or downstream, preferably at most 3 nucleotides upstream or downstream, more preferably at most 2 nucleotides upstream or downstream, most preferably 1 nucleotide upstream or downstream (i.e. adjacent the SNP). When the CRISPR effector binds to the on-target SNP, only a single mismatch will be formed, namely the synthetic mismatch, and the CRISPR effector will continue to be activated and a detectable signal produced. When the guide RNA hybridizes to an off-target SNP, two mismatches will be formed, the mismatch from the SNP and the synthetic mismatch, and no detectable signal is generated. Thus, the systems disclosed herein may be designed to distinguish SNPs within a population. For example, the systems may be used to distinguish pathogenic strains that differ by a single SNP or detect certain disease specific SNPs, such as but not limited to, disease associated SNPs, such as without limitation cancer associated SNPs.

In certain embodiments, the guide RNA is designed such that the SNP is located on position 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 of the spacer sequence (starting at the 5′ end). In certain embodiments, the guide RNA is designed such that the SNP is located on position 1, 2, 3, 4, 5, 6, 7, 8, or 9 of the spacer sequence (starting at the 5′ end). In certain embodiments, the guide RNA is designed such that the SNP is located on position 2, 3, 4, 5, 6, or 7 of the spacer sequence (starting at the 5′ end). In certain embodiments, the guide RNA is designed such that the SNP is located on position 3, 4, 5, or 6 of the spacer sequence (starting at the 5′ end). In certain embodiments, the guide RNA is designed such that the SNP is located on position 3 of the spacer sequence (starting at the 5′ end).

In certain embodiments, the guide RNA is designed such that the mismatch (e.g. the synthetic mismatch, i.e. an additional mutation besides a SNP) is located on position 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 of the spacer sequence (starting at the 5′ end). In certain embodiments, the guide RNA is designed such that the mismatch is located on position 1, 2, 3, 4, 5, 6, 7, 8, or 9 of the spacer sequence (starting at the 5′ end). In certain embodiments, the guide RNA is designed such that the mismatch is located on position 4, 5, 6, or 7 of the spacer sequence (starting at the 5′ end). In certain embodiments, the guide RNA is designed such that the mismatch is located on position 5 of the spacer sequence (starting at the 5′ end).

In certain embodiments, the guide RNA is designed such that the mismatch is located 2 nucleotides upstream of the SNP (i.e. one intervening nucleotide). In certain embodiments, the guide RNA is designed such that the mismatch is located 2 nucleotides downstream of the SNP (i.e. one intervening nucleotide). In certain embodiments, the guide RNA is designed such that the mismatch is located on position 5 of the spacer sequence (starting at the 5′ end) and the SNP is located on position 3 of the spacer sequence (starting at the 5′ end).
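The synthetic mismatch design described above may be illustrated, without limitation, by the following Python sketch. It is a simplified model under stated assumptions: the example target sequence is arbitrary and hypothetical, the function names are illustrative, pairing is scored by strict Watson-Crick complementarity (G-U wobble pairing is ignored), and the spacer is taken to be the reverse complement of the RNA target so that spacer positions are counted from the 5′ end of the spacer.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def design_guide(on_target: str, mismatch_pos: int = 5) -> str:
    """Spacer complementary to on_target except for one synthetic mismatch.

    mismatch_pos is 1-based from the 5' end of the spacer. The base is replaced
    by its own complement, so it becomes identical to the target base it faces
    and cannot pair with it.
    """
    spacer = list(reverse_complement(on_target))
    i = mismatch_pos - 1
    spacer[i] = spacer[i].translate(COMPLEMENT)
    return "".join(spacer)


def snp_variant(target: str, spacer_pos: int, new_base: str) -> str:
    """Target carrying a different base opposite the given 1-based spacer position."""
    i = len(target) - spacer_pos
    return target[:i] + new_base + target[i + 1:]


def mismatch_positions(spacer: str, target: str) -> list:
    """1-based spacer positions that do not pair with the target."""
    perfect = reverse_complement(target)
    return [i + 1 for i, (a, b) in enumerate(zip(spacer, perfect)) if a != b]


# Hypothetical 28-nt on-target site; the SNP faces spacer position 3.
on_target = "AUGGCUAGCUAGGAUCCGAUCGUAACGU"
off_target = snp_variant(on_target, spacer_pos=3, new_base="A")
guide = design_guide(on_target, mismatch_pos=5)

if __name__ == "__main__":
    print("on-target mismatches: ", mismatch_positions(guide, on_target))
    print("off-target mismatches:", mismatch_positions(guide, off_target))
```

In this model the guide forms a single mismatch with the on-target sequence and two mismatches, two nucleotides apart, with the off-target sequence, which corresponds to the SNP at position 3 and synthetic mismatch at position 5 arrangement described above. Whether a given pair of mismatches abolishes activation in practice is an empirical question for each effector and target.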

Transcript Tracking

In another aspect, the present disclosure provides compositions and methods for transcript tracking. In some embodiments, transcript tracking allows researchers to visualize transcripts in cells, tissues, organs or animals, providing important spatio-temporal information regarding RNA dynamics and function. An example approach is shown in FIG. 102.

In some embodiments, the compositions may be a CRISPR-Cas protein herein with one or more labels, or a CRISPR-Cas system comprising such labeled CRISPR-Cas protein. The CRISPR-Cas protein or system may bind to one or more transcripts such that the transcripts may be detected (e.g., visualized) using the label on the CRISPR-Cas protein.

In some embodiments, the present disclosure includes a system for expressing a CRISPR-Cas protein with one or more polypeptide or polynucleotide labels. The system may comprise polynucleotides encoding the CRISPR-Cas protein and/or the labels. The system may further include vector systems comprising such polynucleotides. For example, a CRISPR-Cas protein may be fused with a fluorescent protein or a fragment thereof. Examples of fluorescent proteins include GFP proteins, such as EGFP, Azami-Green, Kaede, ZsGreen1 and CopGFP; CFP proteins, such as Cerulean, mCFP, AmCyan1, MiCy, and CyPet; BFP proteins such as EBFP; YFP proteins such as EYFP, YPet, Venus, ZsYellow, and mCitrine; OFP proteins such as cOFP, mKO, and mOrange; red fluorescent protein, or RFP; red or far-red fluorescent proteins from any other species, such as Heteractis reef coral and Actinia or Entacmaea sea anemone, as well as variants thereof. RFPs include, for example, Discosoma variants, such as mRFP1, mCherry, tdTomato, mStrawberry, mTangerine, DsRed2, and DsRed-T1, Anthomedusa J-Red and Anemonia AsRed2. Far-red fluorescent proteins include, for example, Actinia AQ143, Entacmaea eqFP611, Discosoma variants such as mPlum and mRaspberry, and Heteractis HcRed1 and t-HcRed.

In some cases, the systems for expressing the labeled CRISPR-Cas protein may be inducible. For example, the systems may comprise polynucleotides encoding the CRISPR-Cas protein and/or labels under control of a regulatory element herein, e.g., inducible promoters. Such systems may allow spatial and/or temporal control of the expression of the labels, thus enabling spatial and/or temporal control of transcript tracking.

In certain cases, the CRISPR-Cas protein may be labeled with a detectable tag. The labeling may be performed in cells. Alternatively or additionally, the labeling may be performed first and the labeled CRISPR-Cas protein is then delivered into cells, tissues, or organs.

The detectable tags may be detected (e.g., visualized by imaging, ultrasound, or MRI). Examples of such detectable tags include detectable oligonucleotide tags, which may be, but are not limited to, oligonucleotides comprising unique nucleotide sequences, oligonucleotides comprising detectable moieties, and oligonucleotides comprising both unique nucleotide sequences and detectable moieties. In some cases, the detectable tag comprises a labeling substance, which is detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Such tags include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads®), fluorescent dyes (e.g., fluorescein, Texas Red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Detectable tags may be detected by many methods. For example, radiolabels may be detected using photographic film or scintillation counters, and fluorescent markers may be detected using a photodetector to detect emitted light. Enzymatic labels are typically detected by providing the enzyme with a substrate and detecting the reaction product produced by the action of the enzyme on the substrate, and colorimetric labels are detected by simply visualizing the colored label. Examples of the labeling substance which may be employed include labeling substances known to those skilled in the art, such as fluorescent dyes, enzymes, coenzymes, chemiluminescent substances, and radioactive substances.
Specific examples include radioisotopes (e.g., ³²P, ¹⁴C, ¹²⁵I, ³H, and ¹³¹I), fluorescein, rhodamine, dansyl chloride, umbelliferone, luciferase, peroxidase, alkaline phosphatase, β-galactosidase, β-glucosidase, horseradish peroxidase, glucoamylase, lysozyme, saccharide oxidase, microperoxidase, biotin, and ruthenium. In the case where biotin is employed as a labeling substance, preferably, after addition of a biotin-labeled antibody, streptavidin bound to an enzyme (e.g., peroxidase) is further added. Advantageously, the label is a fluorescent label. Examples of fluorescent labels include, but are not limited to, Atto dyes; 4-acetamido-4′-isothiocyanatostilbene-2,2′-disulfonic acid; acridine and derivatives: acridine, acridine isothiocyanate; 5-(2′-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS); 4-amino-N-[3-(vinylsulfonyl)phenyl]naphthalimide-3,5-disulfonate; N-(4-anilino-1-naphthyl)maleimide; anthranilamide; BODIPY; Brilliant Yellow; coumarin and derivatives: coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), 7-amino-4-trifluoromethylcoumarin (Coumarin 151); cyanine dyes; cyanosine; 4′,6-diamidino-2-phenylindole (DAPI); 5′,5″-dibromopyrogallol-sulfonaphthalein (Bromopyrogallol Red); 7-diethylamino-3-(4′-isothiocyanatophenyl)-4-methylcoumarin; diethylenetriamine pentaacetate; 4,4′-diisothiocyanatodihydro-stilbene-2,2′-disulfonic acid; 4,4′-diisothiocyanatostilbene-2,2′-disulfonic acid; 5-[dimethylamino]naphthalene-1-sulfonyl chloride (DNS, dansyl chloride); 4-dimethylaminophenylazophenyl-4′-isothiocyanate (DABITC); eosin and derivatives: eosin, eosin isothiocyanate; erythrosin and derivatives: erythrosin B, erythrosin isothiocyanate; ethidium; fluorescein and derivatives: 5-carboxyfluorescein (FAM), 5-(4,6-dichlorotriazin-2-yl)aminofluorescein (DTAF), 2′,7′-dimethoxy-4′,5′-dichloro-6-carboxyfluorescein, fluorescein, fluorescein isothiocyanate, QFITC, (XRITC); fluorescamine; IR144; IR1446; Malachite Green isothiocyanate; 4-methylumbelliferone; orthocresolphthalein; nitrotyrosine;
pararosaniline; Phenol Red; B-phycoerythrin; o-phthaldialdehyde; pyrene and derivatives: pyrene, pyrene butyrate, succinimidyl 1-pyrene butyrate; quantum dots; Reactive Red 4 (Cibacron™ Brilliant Red 3B-A); rhodamine and derivatives: 6-carboxy-X-rhodamine (ROX), 6-carboxyrhodamine (R6G), lissamine rhodamine B sulfonyl chloride, rhodamine (Rhod), rhodamine B, rhodamine 123, rhodamine X isothiocyanate, sulforhodamine B, sulforhodamine 101, sulfonyl chloride derivative of sulforhodamine 101 (Texas Red); N,N,N′,N′-tetramethyl-6-carboxyrhodamine (TAMRA); tetramethyl rhodamine; tetramethyl rhodamine isothiocyanate (TRITC); riboflavin; rosolic acid; terbium chelate derivatives; Cy3; Cy5; Cy5.5; Cy7; IRD 700; IRD 800; La Jolla Blue; phthalocyanine; and naphthalocyanine. A fluorescent label may be a fluorescent protein, such as blue fluorescent protein, cyan fluorescent protein, green fluorescent protein, red fluorescent protein, yellow fluorescent protein or any photoconvertible protein. Colorimetric labeling, bioluminescent labeling and/or chemiluminescent labeling may further accomplish labeling. Labeling further may include energy transfer between molecules in the hybridization complex by perturbation analysis, quenching, or electron transport between donor and acceptor molecules, the latter of which may be facilitated by double stranded match hybridization complexes. The fluorescent label may be a perylene or a terrylene. In the alternative, the fluorescent label may be a fluorescent bar code. Advantageously, the label may be light sensitive, wherein the label is light-activated and/or light cleaves the one or more linkers to release the molecular cargo. The light-activated molecular cargo may be a major light-harvesting complex (LHCII). In another embodiment, the fluorescent label may induce free radical formation. In some embodiments, the detectable moieties may be quantum dots.

In some embodiments, the present disclosure provides for a system for delivering the labeled CRISPR-Cas proteins or labeled CRISPR-Cas systems. The delivery system may comprise any delivery vehicles, e.g., those described herein such as RNP, liposomes, nanoparticles, exosomes, microvesicles, nucleic acid nanoassemblies, a gene gun, an implantable device, or the vector systems herein.

Nucleic Acid Targeting

In certain embodiments, the CRISPR-Cas effector protein of the invention is, or comprises, or consists essentially of, or consists of, or involves or relates to such a protein from or as set forth in Tables 1-4, wherein one or more amino acids are mutated, as described herein elsewhere. Thus, in some embodiments, the effector protein may be an RNA-binding protein, such as a dead-Cas type effector protein, which may be optionally functionalized as described herein, for instance with a transcriptional activator or repressor domain, NLS or other functional domain. In some embodiments, the effector protein may be an RNA-binding protein that cleaves a single strand of RNA. If the RNA bound is ssRNA, then the ssRNA is fully cleaved. In some embodiments, the effector protein may be an RNA-binding protein that cleaves a double strand of RNA, for example if it comprises two RNase domains. If the RNA bound is dsRNA, then the dsRNA is fully cleaved. In some embodiments, the effector protein may be an RNA-binding protein that has nickase activity, i.e. it binds dsRNA, but only cleaves one of the RNA strands.

RNase function in CRISPR systems is known; for example, mRNA targeting has been reported for certain type III CRISPR-Cas systems (Hale et al., 2014, Genes Dev, vol. 28, 2432-2443; Hale et al., 2009, Cell, vol. 139, 945-956; Peng et al., 2015, Nucleic acids research, vol. 43, 406-417) and provides significant advantages. A CRISPR-Cas system, composition or method targeting RNA via the present effector proteins is thus provided.

The target RNA, i.e. the RNA of interest, is the RNA to be targeted by the present invention leading to the recruitment to, and the binding of the effector protein at, the target site of interest on the target RNA. The target RNA may be any suitable form of RNA. This may include, in some embodiments, mRNA. In other embodiments, the target RNA may include tRNA or rRNA.

Self-Inactivating Systems

Once all copies of RNA in a cell have been edited, continued CRISPR-Cas effector protein expression or activity in that cell is no longer necessary. A self-inactivating system that relies on the use of the RNA encoding the CRISPR-Cas protein, or the crRNA, as the guide target sequence can shut down the system by preventing expression of CRISPR-Cas or complex formation.

Examples of Target RNAs

The compositions and systems herein may be used for editing various types of target RNAs. Examples of target RNAs are described below.

Interfering RNA (RNAi) and microRNA (miRNA)

In other embodiments, the target RNA may include interfering RNA, i.e. RNA involved in an RNA interference pathway, such as shRNA, siRNA and so forth. In other embodiments, the target RNA may include microRNA (miRNA). Control over interfering RNA or miRNA may help reduce off-target effects (OTE) seen with those approaches by reducing the longevity of the interfering RNA or miRNA in vivo or in vitro.

If the effector protein and suitable guide are selectively expressed (for example spatially or temporally under the control of a suitable promoter, for example a tissue- or cell cycle-specific promoter and/or enhancer) then this could be used to ‘protect’ the cells or systems (in vivo or in vitro) from RNAi in those cells. This may be useful in neighboring tissues or cells where RNAi is not required or for the purposes of comparison of the cells or tissues where the effector protein and suitable guide are and are not expressed (i.e. where the RNAi is not controlled and where it is, respectively). The effector protein may be used to control or bind to molecules comprising or consisting of RNA, such as ribozymes, ribosomes or riboswitches. In embodiments of the invention, the RNA guide can recruit the effector protein to these molecules so that the effector protein is able to bind to them.

Ribosomal RNA (rRNA)

For example, azalide antibiotics, such as azithromycin, are well known. They target and disrupt the 50S ribosomal subunit. The present effector protein, together with a suitable guide RNA to target the 50S ribosomal subunit, may be, in some embodiments, recruited to and bind to the 50S ribosomal subunit. Thus, the present effector protein in concert with a suitable guide directed at a ribosomal (especially the 50S ribosomal subunit) target is provided. Use of this effector protein in concert with the suitable guide directed at the ribosomal (especially the 50S ribosomal subunit) target may include antibiotic use. In particular, the antibiotic use is analogous to the action of azalide antibiotics, such as azithromycin. In some embodiments, prokaryotic ribosomal subunits, such as the 70S subunit in prokaryotes, the 50S subunit mentioned above, the 30S subunit, as well as the 16S and 5S subunits may be targeted. In other embodiments, eukaryotic ribosomal subunits, such as the 80S subunit in eukaryotes, the 60S subunit, the 40S subunit, as well as the 28S, 18S, 5.8S and 5S subunits may be targeted.

The effector protein may be an RNA-binding protein, optionally functionalized, as described herein. In some embodiments, the effector protein may be an RNA-binding protein that cleaves a single strand of RNA. In either case, but particularly where the RNA-binding protein cleaves a single strand of RNA, ribosomal function may be modulated and, in particular, reduced or destroyed. This may apply to any ribosomal RNA and any ribosomal subunit, and the sequences of rRNA are well known.

Control of ribosomal activity is thus envisaged through use of the present effector protein in concert with a suitable guide to the ribosomal target. This may be through cleavage of, or binding to, the ribosome. In particular, reduction of ribosomal activity is envisaged. This may be useful in assaying ribosomal function in vivo or in vitro, but also as a means of controlling therapies based on ribosomal activity, in vivo or in vitro. Furthermore, control (i.e. reduction) of protein synthesis in an in vivo or in vitro system is envisaged, such control including antibiotic and research and diagnostic use.

Riboswitches

A riboswitch (also known as an aptozyme) is a regulatory segment of a messenger RNA molecule that binds a small molecule. This typically results in a change in production of the proteins encoded by the mRNA. Control of riboswitch activity is thus envisaged through use of the present effector protein in concert with a suitable guide to the riboswitch target. This may be through cleavage of, or binding to, the riboswitch. In particular, reduction of riboswitch activity is envisaged. This may be useful in assaying riboswitch function in vivo or in vitro, but also as a means of controlling therapies based on riboswitch activity, in vivo or in vitro. Furthermore, control (i.e. reduction) of protein synthesis in an in vivo or in vitro system is envisaged. This control, as for rRNA, may include antibiotic and research and diagnostic use.

Ribozymes

Ribozymes are RNA molecules having catalytic properties, analogous to enzymes (which are of course proteins). As ribozymes, both naturally occurring and engineered, comprise or consist of RNA, they may also be targeted by the present RNA-binding effector protein. In some embodiments, the effector protein may be an RNA-binding protein that cleaves the ribozyme to thereby disable it. Control of ribozymal activity is thus envisaged through use of the present effector protein in concert with a suitable guide to the ribozymal target. This may be through cleavage of, or binding to, the ribozyme. In particular, reduction of ribozymal activity is envisaged. This may be useful in assaying ribozymal function in vivo or in vitro, but also as a means of controlling therapies based on ribozymal activity, in vivo or in vitro.

RNA-Targeting Applications

Gene Expression, Including RNA Processing

The effector protein may also be used, together with a suitable guide, to target gene expression, including via control of RNA processing. The control of RNA processing may include RNA processing reactions such as RNA splicing, including alternative splicing, via targeting of RNA pol; viral replication (in particular of satellite viruses, bacteriophages and retroviruses, such as HBV, HBC and HIV and others listed herein) including viroids in plants; and tRNA biosynthesis. The effector protein and suitable guide may also be used to control RNA activation (RNAa). RNAa leads to the promotion of gene expression, so control of gene expression may be achieved that way through disruption or reduction of RNAa and thus less promotion of gene expression.

RNAi Screens

By identifying gene products whose knockdown is associated with phenotypic changes, biological pathways can be interrogated and the constituent parts identified via RNAi screens. Control may also be exerted over or during these screens by use of the effector protein and suitable guide to remove or reduce the activity of the RNAi in the screen and thus reinstate the activity of the (previously interfered with) gene product (by removing or reducing the interference/repression).

Satellite RNAs (satRNAs) and satellite viruses may also be treated.

Control herein with reference to RNase activity generally means reduction, negative disruption, knock-down or knock-out.

In Vivo RNA Applications Inhibition of Gene Expression

The target-specific RNases provided herein allow for very specific cutting of a target RNA. The interference at RNA level allows for modulation both spatially and temporally and in a non-invasive way, as the genome is not modified.

A number of diseases have been demonstrated to be treatable by mRNA targeting. While most of these studies relate to administration of siRNA, it is clear that the RNA targeting effector proteins provided herein can be applied in a similar way.

Examples of mRNA targets (and corresponding disease treatments) are VEGF, VEGF-R1 and RTP801 (in the treatment of AMD and/or DME), Caspase 2 (in the treatment of NAION), ADRB2 (in the treatment of intraocular pressure), TRPV1 (in the treatment of dry eye syndrome), Syk kinase (in the treatment of asthma), Apo B (in the treatment of hypercholesterolemia), PLK1, KSP and VEGF (in the treatment of solid tumors), and Bcr-Abl (in the treatment of CML) (Burnett and Rossi Chem Biol. 2012, 19(1): 60-71). Similarly, RNA targeting has been demonstrated to be effective in the treatment of RNA-virus mediated diseases such as HIV (targeting of HIV Tat and Rev), RSV (targeting of RSV nucleocapsid) and HCV (targeting of miR-122) (Burnett and Rossi Chem Biol. 2012, 19(1): 60-71).

It is further envisaged that the RNA targeting effector protein of the invention can be used for mutation-specific or allele-specific knockdown. Guide RNAs can be designed that specifically target a sequence in the transcribed mRNA comprising a mutation or an allele-specific sequence. Such specific knockdown is particularly suitable for therapeutic applications relating to disorders associated with mutated or allele-specific gene products. For example, most cases of familial hypobetalipoproteinemia (FHBL) are caused by mutations in the ApoB gene. This gene encodes two versions of the apolipoprotein B protein: a short version (ApoB-48) and a longer version (ApoB-100). Several ApoB gene mutations that lead to FHBL cause both versions of ApoB to be abnormally short. Specific targeting and knockdown of mutated ApoB mRNA transcripts with an RNA targeting effector protein of the invention may be beneficial in treatment of FHBL. As another example, Huntington's disease (HD) is caused by an expansion of CAG triplet repeats in the gene coding for the Huntingtin protein, which results in an abnormal protein. Specific targeting and knockdown of mutated or allele-specific mRNA transcripts encoding the Huntingtin protein with an RNA targeting effector protein of the invention may be beneficial in treatment of HD.
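The design of a mutation-specific guide can be illustrated with a minimal Python sketch. It assumes a simple substitution mutation, a hypothetical default spacer length of 28 nt (the text above does not fix a length), and hypothetical function names. It only enumerates spacers complementary to windows of the mutant transcript that cover a mutated position; it does not model mismatch tolerance or any other determinant of guide activity.

```python
def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def allele_specific_spacers(wild_type, mutant, spacer_len=28):
    """List (start, spacer) pairs for mutant-transcript windows that cover
    at least one position where the mutant differs from the wild type.

    spacer_len is an illustrative assumption, not a value from the text.
    """
    if len(wild_type) != len(mutant):
        raise ValueError("this sketch handles substitutions only")
    # Positions at which the two alleles differ.
    diffs = [i for i, (w, m) in enumerate(zip(wild_type, mutant)) if w != m]
    spacers = []
    for start in range(len(mutant) - spacer_len + 1):
        end = start + spacer_len
        if any(start <= d < end for d in diffs):
            # The spacer is complementary to the targeted mutant window.
            spacers.append((start, reverse_complement_rna(mutant[start:end])))
    return spacers
```

Each returned spacer pairs perfectly with the mutant allele and carries at least one mismatch against the wild-type allele; whether that mismatch is sufficient for discrimination would have to be determined experimentally.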

Modulation of Gene Expression Through Modulation of RNA Function

Apart from a direct effect on gene expression through cleavage of the mRNA, RNA targeting can also be used to impact specific aspects of the RNA processing within the cell, which may allow a more subtle modulation of gene expression. Generally, modulation can for instance be mediated by interfering with binding of proteins to the RNA, such as for instance blocking binding of proteins, or recruiting RNA binding proteins. Indeed, modulations can be ensured at different levels such as splicing, transport, localization, translation and turnover of the mRNA. Similarly in the context of therapy, it can be envisaged to address (pathogenic) malfunctioning at each of these levels by using RNA-specific targeting molecules. In these embodiments it is in many cases preferred that the RNA targeting protein is a “dead” CRISPR-Cas that has lost the ability to cut the RNA target but maintains its ability to bind thereto, such as the mutated forms of CRISPR-Cas described herein.

a) Alternative Splicing

Many human genes express multiple mRNAs as a result of alternative splicing. Different diseases have been shown to be linked to aberrant splicing leading to loss of function or gain of function of the expressed gene. While some of these diseases are caused by mutations that cause splicing defects, a number of these are not. One therapeutic option is to target the splicing mechanism directly. The RNA targeting effector proteins described herein can for instance be used to block or promote splicing, include or exclude exons and influence the expression of specific isoforms and/or stimulate the expression of alternative protein products. Such applications are described in more detail below.

An RNA targeting effector protein binding to a target RNA can sterically block access of splicing factors to the RNA sequence. The RNA targeting effector protein targeted to a splice site may block splicing at the site, optionally redirecting splicing to an adjacent site. For instance an RNA targeting effector protein binding to the 5′ splice site can block the recruitment of the U1 component of the spliceosome, favoring the skipping of that exon. Alternatively, an RNA targeting effector protein targeted to a splicing enhancer or silencer can prevent binding of trans-acting regulatory splicing factors at the target site and effectively block or promote splicing. Exon exclusion can further be achieved by recruitment of ILF2/3 to precursor mRNA near an exon by an RNA targeting effector protein as described herein. As yet another example, a glycine rich domain can be attached for recruitment of hnRNP A1 and exon exclusion (Del Gatto-Konczak et al. Mol Cell Biol. 1999 January; 19(1):251-60).

In certain embodiments, through appropriate selection of gRNA, specific splice variants may be targeted, while other splice variants will not be targeted.
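One way to think about selecting a guide for a single splice variant is to look for target windows that occur in that variant and in no other. The following Python sketch does this by plain substring comparison. The function name, the default window length of 28 nt and the toy exon sequences in the usage note are assumptions made for illustration; real guide selection would also consider accessibility and off-target transcripts.

```python
def isoform_specific_windows(target_isoform, other_isoforms, window=28):
    """Return (start, site) pairs for windows of target_isoform that do not
    occur anywhere in the other isoform sequences.

    window is an illustrative assumption, not a value from the text.
    """
    hits = []
    for start in range(len(target_isoform) - window + 1):
        site = target_isoform[start:start + window]
        # Keep the window only if no other isoform contains it.
        if not any(site in other for other in other_isoforms):
            hits.append((start, site))
    return hits
```

For example, with three toy exons "AAAAAA", "CCCCCC" and "GGGGGG", an exon-skipped isoform (exons 1 and 3) differs from the full-length isoform only at the new exon 1/exon 3 junction, so the windows returned for the skipped isoform are exactly those spanning that junction.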

In some cases the RNA targeting effector protein can be used to promote splicing (e.g. where splicing is defective). For instance an RNA targeting effector protein can be associated with an effector capable of stabilizing a splicing regulatory stem-loop in order to further splicing. The RNA targeting effector protein can be linked to a consensus binding site sequence for a specific splicing factor in order to recruit the protein to the target RNA.

Examples of diseases which have been associated with aberrant splicing include, but are not limited to, Paraneoplastic Opsoclonus Myoclonus Ataxia (or POMA), resulting from a loss of Nova proteins which regulate splicing of proteins that function in the synapse, and Cystic Fibrosis, which is caused by defective splicing of a cystic fibrosis transmembrane conductance regulator, resulting in the production of nonfunctional chloride channels. In other diseases aberrant RNA splicing results in gain-of-function. This is the case for instance in myotonic dystrophy which is caused by a CUG triplet-repeat expansion (from 50 to >1500 repeats) in the 3′UTR of an mRNA, causing splicing defects.

The RNA targeting effector protein can be used to include an exon by recruiting a splicing factor (such as U1) to a 5′ splicing site to promote excision of introns around a desired exon. Such recruitment could be mediated through a fusion with an arginine/serine rich domain, which functions as splicing activator (Graveley B R and Maniatis T, Mol Cell. 1998 (5):765-71).

It is envisaged that the RNA targeting effector protein can be used to block the splicing machinery at a desired locus, resulting in preventing exon recognition and the expression of a different protein product. An example of a disorder that may be treated is Duchenne muscular dystrophy (DMD), which is caused by mutations in the gene encoding for the dystrophin protein. Almost all DMD mutations lead to frameshifts, resulting in impaired dystrophin translation. The RNA targeting effector protein can be paired with splice junctions or exonic splicing enhancers (ESEs) thereby preventing exon recognition, resulting in the translation of a partially functional protein. This converts the lethal Duchenne phenotype into the less severe Becker phenotype.
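The reasoning behind exon skipping as a way to rescue a frameshift is arithmetic: the reading frame downstream is preserved when the total number of nucleotides removed from the mature mRNA is a multiple of three. The Python sketch below checks which additional exon, if skipped, would restore the frame. The function name and the exon lengths in the usage note are hypothetical and are not taken from the dystrophin gene.

```python
def frame_restoring_skips(exon_lengths, deleted_exons, candidate_exons):
    """Return the candidate exons whose skipping, added to an existing
    deletion, makes the total removed length a multiple of three.

    exon_lengths maps an exon identifier to its length in nucleotides.
    """
    removed = sum(exon_lengths[exon] for exon in deleted_exons)
    restoring = []
    for exon in candidate_exons:
        # In frame only if the combined loss is divisible by 3.
        if (removed + exon_lengths[exon]) % 3 == 0:
            restoring.append(exon)
    return restoring
```

With hypothetical lengths {1: 120, 2: 100, 3: 110, 4: 90} and exon 2 deleted (100 nt, out of frame), skipping exon 3 removes 210 nt in total and restores the frame, whereas skipping exon 1 (220 nt in total) does not.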

b) RNA Modification

RNA editing is a natural process whereby the diversity of gene products of a given sequence is increased by minor modification in the RNA. Typically, the modification involves the conversion of adenosine (A) to inosine (I), resulting in an RNA sequence which is different from that encoded by the genome. RNA modification is generally ensured by the ADAR enzyme, whereby the pre-RNA target forms an imperfect duplex RNA by base-pairing between the exon that contains the adenosine to be edited and an intronic non-coding element. A classic example of A-I editing is the glutamate receptor GluR-B mRNA, whereby the change results in modified conductance properties of the channel (Higuchi M, et al. Cell. 1993; 75:1361-70).

In humans, a heterozygous functional-null mutation in the ADAR1 gene leads to a skin disease, human pigmentary genodermatosis (Miyamura Y, et al. Am J Hum Genet. 2003; 73:693-9). It is envisaged that the RNA targeting effector proteins of the present invention can be used to correct malfunctioning RNA modification.
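The effect of A-to-I editing on the encoded protein can be sketched in Python. The sketch relies on the general observation that inosine is read as guanosine during translation; the function names are hypothetical, and the CAG codon used in the usage note is given as an illustration of a glutamine-to-arginine change of the kind commonly described for the GluR-B transcript, not as a sequence taken from this document.

```python
from itertools import product

# Standard genetic code, built in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def edit_a_to_i(rna, position):
    """Replace the adenosine at a 0-based position with inosine (I)."""
    if rna[position] != "A":
        raise ValueError("only adenosine can be edited to inosine")
    return rna[:position] + "I" + rna[position + 1:]


def translate(rna):
    """Translate an RNA sequence, reading inosine as guanosine."""
    readable = rna.replace("I", "G")
    usable = len(readable) - len(readable) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[readable[i:i + 3]] for i in range(0, usable, 3))
```

Under these assumptions, translate("CAG") gives glutamine (Q), while translate(edit_a_to_i("CAG", 1)) reads the edited codon CIG as CGG and gives arginine (R).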

c) Polyadenylation

Polyadenylation of an mRNA is important for nuclear transport, translation efficiency and stability of the mRNA, and all of these, as well as the process of polyadenylation, depend on specific RBPs. Most eukaryotic mRNAs receive a 3′ poly(A) tail of about 200 nucleotides after transcription. Polyadenylation involves different RNA-binding protein complexes which stimulate the activity of a poly(A) polymerase (Minvielle-Sebastia L et al. Curr Opin Cell Biol. 1999; 11:352-7). It is envisaged that the RNA-targeting effector proteins provided herein can be used to interfere with or promote the interaction between the RNA-binding proteins and RNA.

An example of a disease which has been linked to defective proteins involved in polyadenylation is oculopharyngeal muscular dystrophy (OPMD) (Brais B, et al. Nat Genet. 1998; 18:164-7).

d) RNA Export

After pre-mRNA processing, the mRNA is exported from the nucleus to the cytoplasm. This is ensured by a cellular mechanism which involves the generation of a carrier complex, which is then translocated through the nuclear pore and releases the mRNA in the cytoplasm, with subsequent recycling of the carrier.

Overexpression of proteins (such as TAP) which play a role in the export of RNA has been found to increase export of transcripts that are otherwise inefficiently exported in Xenopus (Katahira J, et al. EMBO J. 1999; 18:2593-609).

e) mRNA Localization

mRNA localization ensures spatially regulated protein production. Localization of transcripts to a specific region of the cell can be ensured by localization elements. In particular embodiments, it is envisaged that the effector proteins described herein can be used to target localization elements to the RNA of interest. The effector proteins can be designed to bind the target transcript and shuttle it to a location in the cell determined by its peptide signal tag. More particularly for instance, an RNA targeting effector protein fused to a nuclear localization signal (NLS) can be used to alter RNA localization.

Further examples of localization signals include the zipcode binding protein (ZBP1) which ensures localization of β-actin to the cytoplasm in several asymmetric cell types, KDEL retention sequence (localization to endoplasmic reticulum), nuclear export signal (localization to cytoplasm), mitochondrial targeting signal (localization to mitochondria), peroxisomal targeting signal (localization to peroxisome) and m6A marking/YTHDF2 (localization to P-bodies). Other approaches that are envisaged are fusion of the RNA targeting effector protein with proteins of known localization (for instance membrane, synapse).

Alternatively, the effector protein according to the invention may for instance be used in localization-dependent knockdown. By fusing the effector protein to an appropriate localization signal, the effector is targeted to a particular cellular compartment. Only target RNAs residing in this compartment will effectively be targeted, whereas otherwise identical targets residing in a different cellular compartment will not be targeted, such that a localization-dependent knockdown can be established.

f) Translation

The RNA targeting effector proteins described herein can be used to enhance or repress translation. It is envisaged that upregulating translation is a very robust way to control cellular circuits. Further, for functional studies a protein translation screen can be favorable over transcriptional upregulation screens, which have the shortcoming that upregulation of transcript does not translate into increased protein production.

It is envisaged that the RNA targeting effector proteins described herein can be used to bring translation initiation factors, such as EIF4G, in the vicinity of the 5′ untranslated region (5′UTR) of a messenger RNA of interest to drive translation (as described in De Gregorio et al. EMBO J. 1999; 18(17):4865-74 for a non-reprogrammable RNA binding protein). As another example GLD2, a cytoplasmic poly(A) polymerase, can be recruited to the target mRNA by an RNA targeting effector protein. This would allow for directed polyadenylation of the target mRNA thereby stimulating translation.

Similarly, the RNA targeting effector proteins envisaged herein can be used to block translational repressors of mRNA, such as ZBP1 (Huttelmaier S, et al. Nature. 2005; 438:512-5). By binding to the translation initiation site of a target RNA, translation can be directly affected.

In addition, by fusing the RNA targeting effector proteins to a protein that stabilizes mRNAs, e.g. by preventing degradation thereof such as RNase inhibitors, it is possible to increase protein production from the transcripts of interest.

It is envisaged that the RNA targeting effector proteins described herein can be used to repress translation by binding in the 5′ UTR regions of an RNA transcript and preventing the ribosome from forming and beginning translation.

Further, the RNA targeting effector protein can be used to recruit Caf1, a component of the CCR4-NOT deadenylase complex, to the target mRNA, resulting in deadenylation of the target transcript and inhibition of protein translation.

For instance, the RNA targeting effector protein of the invention can be used to increase or decrease translation of therapeutically relevant proteins. Examples of therapeutic applications wherein the RNA targeting effector protein can be used to downregulate or upregulate translation are in amyotrophic lateral sclerosis (ALS) and cardiovascular disorders. Reduced levels of the glial glutamate transporter EAAT2 have been reported in ALS motor cortex and spinal cord, as well as multiple abnormal EAAT2 mRNA transcripts in ALS brain tissue. Loss of the EAAT2 protein and function is thought to be the main cause of excitotoxicity in ALS. Restoration of EAAT2 protein levels and function may provide therapeutic benefit. Hence, the RNA targeting effector protein can be beneficially used to upregulate the expression of EAAT2 protein, e.g. by blocking translational repressors or stabilizing mRNA as described above. Apolipoprotein A1 is the major protein component of high density lipoprotein (HDL) and ApoA1 and HDL are generally considered as atheroprotective. It is envisaged that the RNA targeting effector protein can be beneficially used to upregulate the expression of ApoA1, e.g. by blocking translational repressors or stabilizing mRNA as described above.

g) mRNA Turnover

Translation is tightly coupled to mRNA turnover and regulated mRNA stability. Specific proteins have been described to be involved in the stability of transcripts, such as the ELAV/Hu proteins in neurons (Keene J D, 1999, Proc Natl Acad Sci USA. 96:5-7) and tristetraprolin (TTP). These proteins stabilize target mRNAs by protecting the messages from degradation in the cytoplasm (Peng S S et al., 1998, EMBO J. 17:3461-70).

It can be envisaged that the RNA-targeting effector proteins of the present invention can be used to interfere with or to promote the activity of proteins acting to stabilize mRNA transcripts, such that mRNA turnover is affected. For instance, recruitment of human TTP to the target RNA using the RNA targeting effector protein would allow for adenylate-uridylate-rich element (AU-rich element) mediated translational repression and target degradation. AU-rich elements are found in the 3′ UTR of many mRNAs that code for proto-oncogenes, nuclear transcription factors, and cytokines and promote RNA stability. As another example, the RNA targeting effector protein can be fused to HuR, another mRNA stabilization protein (Hinman M N and Lou H, Cell Mol Life Sci 2008; 65:3168-81), and recruit it to a target transcript to prolong its lifetime or stabilize short-lived mRNA.

It is further envisaged that the RNA-targeting effector proteins described herein can be used to promote degradation of target transcripts. For instance, m6A methyltransferase can be recruited to the target transcript to localize the transcript to P-bodies leading to degradation of the target.

As yet another example, an RNA targeting effector protein as described herein can be fused to the non-specific endonuclease domain PilT N-terminus (PIN), to recruit it to a target transcript and allow degradation thereof.

Patients with paraneoplastic neurological disorder (PND)-associated encephalomyelitis and neuropathy are patients who develop autoantibodies against Hu-proteins in tumors outside of the central nervous system (Szabo A et al. 1991, Cell; 67:325-33), which then cross the blood-brain barrier. It can be envisaged that the RNA-targeting effector proteins of the present invention can be used to interfere with the binding of auto-antibodies to mRNA transcripts.

Patients with myotonic dystrophy type 1 (DM1), caused by the expansion of (CUG)n in the 3′ UTR of the dystrophia myotonica-protein kinase (DMPK) gene, are characterized by the accumulation of such transcripts in the nucleus. It is envisaged that the RNA targeting effector proteins of the invention fused with an endonuclease targeted to the (CUG)n repeats could inhibit such accumulation of aberrant transcripts.
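Locating an expanded (CUG)n tract in a transcript, as a candidate target region, can be sketched with a regular expression in Python. The function name is hypothetical, and the default threshold of 50 repeats is taken from the lower bound of the expansion range mentioned earlier in this section; it is used here only as an illustrative cut-off.

```python
import re


def find_cug_repeats(rna, min_repeats=50):
    """Return (start, repeat_count) for each uninterrupted run of at least
    min_repeats consecutive CUG triplets in the RNA sequence."""
    pattern = re.compile(r"(?:CUG){%d,}" % min_repeats)
    return [
        (match.start(), (match.end() - match.start()) // 3)
        for match in pattern.finditer(rna)
    ]
```

A transcript containing a 60-repeat tract and a separate 5-repeat tract would yield only the 60-repeat tract at the default threshold. Interrupted repeats are not handled by this sketch.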

h) Interaction with Multi-Functional Proteins

Some RNA-binding proteins bind to multiple sites on numerous RNAs to function in diverse processes. For instance, the hnRNP A1 protein has been found to bind exonic splicing silencer sequences, antagonizing the splicing factors, associate with telomere ends (thereby stimulating telomere activity) and bind miRNA to facilitate Drosha-mediated processing thereby affecting maturation. It is envisaged that the RNA-binding effector proteins of the present invention can interfere with the binding of RNA-binding proteins at one or more locations.

i) RNA Folding

RNA adopts a defined structure in order to perform its biological activities. Transitions in conformation among alternative tertiary structures are critical to most RNA-mediated processes. However, RNA folding can be associated with several problems. For instance, RNA may have a tendency to fold into, and be upheld in, improper alternative conformations and/or the correct tertiary structure may not be sufficiently thermodynamically favored over alternative structures. The RNA targeting effector protein, in particular a cleavage-deficient or dead RNA targeting protein, of the invention may be used to direct folding of (m)RNA and/or capture the correct tertiary structure thereof.

Use of RNA-Targeting Effector Protein in Modulating Cellular Status

In certain embodiments CRISPR-Cas in a complex with crRNA is activated upon binding to target RNA and subsequently cleaves any nearby ssRNA targets (i.e. “collateral” or “bystander” effects). CRISPR-Cas, once primed by the cognate target, can cleave other (non-complementary) RNA molecules. Such promiscuous RNA cleavage could potentially cause cellular toxicity, or otherwise affect cellular physiology or cell status.

Accordingly, in certain embodiments, the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein are used for or are for use in induction of cell dormancy. In certain embodiments, the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein are used for or are for use in induction of cell cycle arrest. In certain embodiments, the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein are used for or are for use in reduction of cell growth and/or cell proliferation. In certain embodiments, the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein are used for or are for use in induction of cell anergy. In certain embodiments, the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein are used for or are for use in induction of cell apoptosis. In certain embodiments, the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein are used for or are for use in induction of cell necrosis. In certain embodiments, the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein are used for or are for use in induction of cell death. In certain embodiments, the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein are used for or are for use in induction of programmed cell death.

In certain embodiments, the invention relates to a method for induction of cell dormancy comprising introducing or inducing the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein. In certain embodiments, the invention relates to a method for induction of cell cycle arrest comprising introducing or inducing the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein. In certain embodiments, the invention relates to a method for reduction of cell growth and/or cell proliferation comprising introducing or inducing the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein. In certain embodiments, the invention relates to a method for induction of cell anergy comprising introducing or inducing the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein. In certain embodiments, the invention relates to a method for induction of cell apoptosis comprising introducing or inducing the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein. In certain embodiments, the invention relates to a method for induction of cell necrosis comprising introducing or inducing the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein. In certain embodiments, the invention relates to a method for induction of cell death comprising introducing or inducing the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein. In certain embodiments, the invention relates to a method for induction of programmed cell death comprising introducing or inducing the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein.

The methods and uses as described herein may be therapeutic or prophylactic and may target particular cells, cell (sub)populations, or cell/tissue types. In particular, the methods and uses as described herein may be therapeutic or prophylactic and may target particular cells, cell (sub)populations, or cell/tissue types expressing one or more target sequences, such as one or more particular target RNA (e.g. ss RNA). Without limitation, target cells may for instance be cancer cells expressing a particular transcript, e.g. neurons of a given class, (immune) cells causing e.g. autoimmunity, or cells infected by a specific (e.g. viral) pathogen, etc.

Accordingly, in certain embodiments, the invention relates to a method for treating a pathological condition characterized by the presence of undesirable cells (host cells), comprising introducing or inducing the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein. In certain embodiments, the invention relates to the use of the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein for treating a pathological condition characterized by the presence of undesirable cells (host cells). In certain embodiments, the invention relates to the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein for use in treating a pathological condition characterized by the presence of undesirable cells (host cells). It is to be understood that preferably the CRISPR-Cas system targets a target specific for the undesirable cells. In certain embodiments, the invention relates to the use of the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein for treating, preventing, or alleviating cancer. In certain embodiments, the invention relates to the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein for use in treating, preventing, or alleviating cancer. In certain embodiments, the invention relates to a method for treating, preventing, or alleviating cancer comprising introducing or inducing the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein. It is to be understood that preferably the CRISPR-Cas system targets a target specific for the cancer cells. In certain embodiments, the invention relates to the use of the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein for treating, preventing, or alleviating infection of cells by a pathogen. In certain embodiments, the invention relates to the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein for use in treating, preventing, or alleviating infection of cells by a pathogen. In certain embodiments, the invention relates to a method for treating, preventing, or alleviating infection of cells by a pathogen comprising introducing or inducing the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein. It is to be understood that preferably the CRISPR-Cas system targets a target specific for the cells infected by the pathogen (e.g. a pathogen derived target). In certain embodiments, the invention relates to the use of the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein for treating, preventing, or alleviating an autoimmune disorder. In certain embodiments, the invention relates to the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein for use in treating, preventing, or alleviating an autoimmune disorder. In certain embodiments, the invention relates to a method for treating, preventing, or alleviating an autoimmune disorder comprising introducing or inducing the non-naturally occurring or engineered composition, vector system, or delivery systems as described herein. It is to be understood that preferably the CRISPR-Cas system targets a target specific for the cells responsible for the autoimmune disorder (e.g. specific immune cells).

Use of RNA-Targeting Effector Protein in RNA Detection

It is further envisaged that the RNA targeting effector protein can be used in Northern blot assays. Northern blotting involves the use of electrophoresis to separate RNA samples by size. The RNA targeting effector protein can be used to specifically bind and detect the target RNA sequence.

An RNA targeting effector protein can be fused to a fluorescent protein (such as GFP) and used to track RNA localization in living cells. More particularly, the RNA targeting effector protein can be inactivated in that it no longer cleaves RNA. In particular embodiments, it is envisaged that a split RNA targeting effector protein can be used, whereby the signal is dependent on the binding of both subproteins, in order to ensure a more precise visualization. Alternatively, a split fluorescent protein can be used that is reconstituted when multiple RNA targeting effector protein complexes bind to the target transcript. It is further envisaged that a transcript is targeted at multiple binding sites along the mRNA so the fluorescent signal can amplify the true signal and allow for focal identification. As yet another alternative, the fluorescent protein can be reconstituted from a split intein.
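Targeting a transcript at multiple binding sites can be pictured as tiling the mRNA with non-overlapping target windows. The Python sketch below generates such a tiling. The function name, the default site length of 28 nt and the default spacing of 5 nt between sites are assumptions chosen for illustration; the text does not specify these values, and site accessibility is not considered.

```python
def tile_binding_sites(transcript, site_len=28, spacing=5):
    """Return (start, site) pairs for non-overlapping windows laid end to
    end along the transcript, separated by a fixed spacing.

    site_len and spacing are illustrative assumptions.
    """
    sites = []
    start = 0
    while start + site_len <= len(transcript):
        sites.append((start, transcript[start:start + site_len]))
        # Move past this site plus the gap before the next one.
        start += site_len + spacing
    return sites
```

For a 40 nt transcript with a site length of 10 and a spacing of 5, this yields three sites starting at positions 0, 15 and 30.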

RNA targeting effector proteins are for instance suitably used to determine the localization of the RNA or specific splice variants, the level of mRNA transcript, up- or down-regulation of transcripts and disease-specific diagnosis. The RNA targeting effector proteins can be used for visualization of RNA in (living) cells using e.g. fluorescence microscopy or flow cytometry, such as fluorescence-activated cell sorting (FACS) which allows for high-throughput screening of cells and recovery of living cells following cell sorting. Further, expression levels of different transcripts can be assessed simultaneously under stress, e.g. inhibition of cancer growth using molecular inhibitors or hypoxic conditions on cells. Another application would be to track localization of transcripts to synaptic connections during a neural stimulus using two photon microscopy.

In certain embodiments, the components or complexes according to the invention as described herein can be used in multiplexed error-robust fluorescence in situ hybridization (MERFISH; Chen et al. Science; 2015; 348(6233)), such as for instance with (fluorescently) labeled CRISPR-Cas effectors.

In Vitro Apex Labeling

Cellular processes depend on a network of molecular interactions among protein, RNA, and DNA. Accurate detection of protein-DNA and protein-RNA interactions is key to understanding such processes. In vitro proximity labeling technology employs an affinity tag combined with e.g. a photoactivatable probe to label polypeptides and RNAs in the vicinity of a protein or RNA of interest in vitro. After UV irradiation the photoactivatable group reacts with proteins and other molecules that are in close proximity to the tagged molecule, thereby labelling them. Labelled interacting molecules can subsequently be recovered and identified. The RNA targeting effector protein of the invention can for instance be used to target a probe to a selected RNA sequence.

These applications could also be applied in animal models for in vivo imaging of disease relevant applications or difficult-to-culture cell types.

Use of RNA-Targeting Effector Protein in RNA Origami/In Vitro Assembly Lines - Combinatorics

RNA origami refers to nanoscale folded structures for creating two-dimensional or three-dimensional structures using RNA as integrated template. The folded structure is encoded in the RNA and the shape of the resulting RNA is thus determined by the synthesized RNA sequence (Geary, et al. 2014. Science, 345 (6198). pp. 799-804). The RNA origami may act as scaffold for arranging other components, such as proteins, into complexes. The RNA targeting effector protein of the invention can for instance be used to target proteins of interest to the RNA origami using a suitable guide RNA.

These applications could also be applied in animal models for in vivo imaging of disease relevant applications or difficult-to-culture cell types.

Use of RNA-Targeting Effector Protein in RNA Isolation or Purification,Enrichment or Depletion

It is further envisaging that the RNA targeting effector protein whencomplexed to RNA can be used to isolate and/or purify the RNA. The RNAtargeting effector protein can for instance be fused to an affinity tagthat can be used to isolate and/or purify the RNA-RNA targeting effectorprotein complex. Such applications are for instance useful in theanalysis of gene expression profiles in cells.

In particular embodiments, it can be envisaged that the RNA targeting effector proteins can be used to target a specific noncoding RNA (ncRNA), thereby blocking its activity and providing a useful functional probe. In certain embodiments, the effector protein as described herein may be used to specifically enrich for a particular RNA (including but not limited to increasing stability, etc.), or alternatively to specifically deplete a particular RNA (such as, without limitation, particular splice variants, isoforms, etc.).

Interrogation of LincRNA Function and Other Nuclear RNAs

Current RNA knockdown strategies such as siRNA have the disadvantage that they are mostly limited to targeting cytosolic transcripts, since the protein machinery is cytosolic. The advantage of an RNA targeting effector protein of the present invention, an exogenous system that is not essential to cell function, is that it can be used in any compartment in the cell. By fusing an NLS signal to the RNA targeting effector protein, it can be guided to the nucleus, allowing nuclear RNAs to be targeted. It is for instance envisaged to probe the function of lincRNAs. Long intergenic non-coding RNAs (lincRNAs) are a vastly underexplored area of research. Most lincRNAs have as yet unknown functions, which could be studied using the RNA targeting effector protein of the invention.

Identification of RNA Binding Proteins

Identifying proteins bound to specific RNAs can be useful for understanding the roles of many RNAs. For instance, many lincRNAs associate with transcriptional and epigenetic regulators to control transcription. Understanding what proteins bind to a given lincRNA can help elucidate the components in a given regulatory pathway. An RNA targeting effector protein of the invention can be designed to recruit a biotin ligase to a specific transcript in order to label locally bound proteins with biotin. The proteins can then be pulled down and analyzed by mass spectrometry to identify them.

Assembly of Complexes on RNA and Substrate Shuttling

RNA targeting effector proteins of the invention can further be used to assemble complexes on RNA. This can be achieved by functionalizing the RNA targeting effector protein with multiple related proteins (e.g. components of a particular synthesis pathway). Alternatively, multiple RNA targeting effector proteins can be functionalized with such different related proteins and targeted to the same or adjacent target RNA. A useful application of assembling complexes on RNA is, for instance, facilitating substrate shuttling between proteins.

Synthetic Biology

The development of biological systems has a wide utility, including in clinical applications. It is envisaged that the programmable RNA targeting effector proteins of the invention can be fused to split proteins of toxic domains for targeted cell death, for instance using cancer-linked RNA as target transcript. Further, pathways involving protein-protein interaction can be influenced in synthetic biological systems with, e.g., fusion complexes with the appropriate effectors such as kinases or other enzymes.

Protein Splicing: Inteins

Protein splicing is a post-translational process in which an intervening polypeptide, referred to as an intein, catalyzes its own excision from the polypeptides flanking it, referred to as exteins, as well as subsequent ligation of the exteins. The assembly of two or more RNA targeting effector proteins as described herein on a target transcript could be used to direct the release of a split intein (Topilina and Mills Mob DNA. 2014 Feb. 4; 5(1):5), thereby allowing for direct computation of the existence of an mRNA transcript and subsequent release of a protein product, such as a metabolic enzyme or a transcription factor (for downstream actuation of transcription pathways). This application may have significant relevance in synthetic biology (see above) or large-scale bioproduction (only produce product under certain conditions).

Inducible, Dosed and Self-Inactivating Systems

In one embodiment, fusion complexes comprising an RNA targeting effector protein of the invention and an effector component are designed to be inducible, for instance light inducible or chemically inducible. Such inducibility allows for activation of the effector component at a desired moment in time.

Light inducibility is for instance achieved by designing a fusion complex wherein CRY2 PHR/CIBN pairing is used for fusion. This system is particularly useful for light induction of protein interactions in living cells (Konermann S, et al. Nature. 2013; 500:472-476).

Chemical inducibility is for instance provided for by designing a fusion complex wherein FKBP/FRB (FK506 binding protein/FKBP rapamycin binding) pairing is used for fusion. Using this system, rapamycin is required for binding of proteins (Zetsche et al. Nat Biotechnol. 2015; 33(2):139-42 describes the use of this system for Cas9).

Further, when introduced in the cell as DNA, the RNA targeting effector protein of the invention can be modulated by inducible promoters, such as tetracycline or doxycycline controlled transcriptional activation (Tet-On and Tet-Off expression system), or a hormone inducible gene expression system such as, for instance, an ecdysone inducible gene expression system, or an arabinose-inducible gene expression system. When delivered as RNA, expression of the RNA targeting effector protein can be modulated via a riboswitch, which can sense a small molecule like tetracycline (as described in Goldfless et al. Nucleic Acids Res. 2012; 40(9):e64).

In one embodiment, the delivery of the RNA targeting effector protein of the invention can be modulated to change the amount of protein or crRNA in the cell, thereby changing the magnitude of the desired effect or any undesired off-target effects.

In one embodiment, the RNA targeting effector proteins described herein can be designed to be self-inactivating. When delivered to a cell as RNA, either mRNA or as a replication RNA therapeutic (Wroblewska et al. Nat Biotechnol. 2015 August; 33(8): 839-841), they can self-inactivate expression and subsequent effects by destroying their own RNA, thereby reducing residency and potential undesirable effects.

For further in vivo applications of RNA targeting effector proteins as described herein, reference is made to Mackay J P et al (Nat Struct Mol Biol. 2011 March; 18(3):256-61), Nelles et al (Bioessays. 2015 July; 37(7):732-9) and Abil Z and Zhao H (Mol Biosyst. 2015 October; 11(10):2658-65), which are incorporated herein by reference. In particular, the following applications are envisaged in certain embodiments of the invention, preferably in certain embodiments by using catalytically inactive CRISPR-Cas: enhancing translation (e.g. CRISPR-Cas-translation promotion factor fusions (e.g. eIF4 fusions)); repressing translation (e.g. gRNA targeting ribosome binding sites); exon skipping (e.g. gRNAs targeting splice donor and/or acceptor sites); exon inclusion (e.g. gRNA targeting a particular exon splice donor and/or acceptor site to be included or CRISPR-Cas fused to or recruiting spliceosome components (e.g. U1 snRNA)); assessing RNA localization (e.g. CRISPR-Cas-marker fusions (e.g. EGFP fusions)); altering RNA localization (e.g. CRISPR-Cas-localization signal fusions (e.g. NLS or NES fusions)); RNA degradation (in this case no catalytically inactive CRISPR-Cas is to be used if the activity of CRISPR-Cas is relied on; alternatively, and for increased specificity, a split CRISPR-Cas may be used); inhibition of non-coding RNA function (e.g. miRNA), such as by degradation or binding of gRNA to functional sites (possibly titrating out at specific sites by relocalization by CRISPR-Cas-signal sequence fusions).

As described herein before and demonstrated in the Examples, CRISPR-Cas function is robust to 5′ or 3′ extensions of the crRNA and to extension of the crRNA loop. It is therefore envisaged that MS2 loops and other recruitment domains can be added to the crRNA without affecting complex formation and binding to target transcripts. Such modifications to the crRNA for recruitment of various effector domains are applicable in the uses of the RNA targeting effector proteins described above.

CRISPR-Cas is capable of mediating resistance to RNA phages. It is therefore envisaged that CRISPR-Cas can be used to immunize, e.g. animals, humans and plants, against RNA-only pathogens, including but not limited to Ebola virus and Zika virus.

In certain embodiments, CRISPR-Cas can process (cleave) its own array. This applies to both the wildtype CRISPR-Cas protein and the mutated CRISPR-Cas protein containing one or more mutated amino acid residues as herein-discussed. It is therefore envisaged that multiple crRNAs designed for different target transcripts and/or applications can be delivered as a single pre-crRNA or as a single transcript driven by one promoter. Such a method of delivery has the advantages that it is substantially more compact, easier to synthesize and easier to deliver in viral systems. It will be understood that exact amino acid positions may vary for orthologues of a CRISPR-Cas described herein, and can be adequately determined by protein alignment, as is known in the art, and as described herein elsewhere. Aspects of the invention also encompass methods and uses of the compositions and systems described herein in genome engineering, e.g. for altering or manipulating the expression of one or more genes or the one or more gene products, in prokaryotic or eukaryotic cells, in vitro, in vivo or ex vivo.
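As an illustration of delivering multiple crRNAs as a single pre-crRNA, the following minimal Python sketch concatenates repeat/spacer units into one array string. It is a sketch under stated assumptions only: the function name `build_pre_crrna`, the `repeat_first` flag, and all sequences are hypothetical placeholders and are not taken from the present disclosure. The relative order of direct repeat and spacer differs between Cas13 orthologues, so the order is left as a parameter rather than fixed.

```python
from __future__ import annotations

# Illustrative sketch only: the sequences are placeholders, not real
# direct repeats or spacers from the disclosure.

def build_pre_crrna(direct_repeat: str, spacers: list[str],
                    repeat_first: bool = True) -> str:
    """Concatenate repeat/spacer units into a single pre-crRNA array string.

    repeat_first=True  -> each unit is direct repeat followed by spacer
    repeat_first=False -> each unit is spacer followed by direct repeat
    """
    valid = set("ACGU")
    for seq in [direct_repeat, *spacers]:
        # Reject empty strings and anything that is not plain RNA letters.
        if not seq or set(seq.upper()) - valid:
            raise ValueError(f"not an RNA sequence: {seq!r}")
    dr = direct_repeat.upper()
    units = []
    for spacer in spacers:
        sp = spacer.upper()
        units.append(dr + sp if repeat_first else sp + dr)
    return "".join(units)


PLACEHOLDER_DR = "GUUGUAGAAGCC"             # hypothetical direct repeat
spacers = ["AAAACCCCGGGG", "UUUUGGGGCCCC"]  # hypothetical spacers
array = build_pre_crrna(PLACEHOLDER_DR, spacers)
print(array)
```

Under this sketch, one transcript carries every repeat/spacer unit, which is the property the paragraph above relies on when it describes the array as more compact and easier to deliver than separate guides.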

In an aspect, the invention provides methods and compositions for modulating, e.g., reducing, expression of a target RNA in cells. In the subject methods, a CRISPR-Cas system of the invention is provided that interferes with transcription, stability, and/or translation of an RNA.

In certain embodiments, an effective amount of CRISPR-Cas system is used to cleave RNA or otherwise inhibit RNA expression. In this regard, the system has uses similar to siRNA and shRNA, and thus can also be substituted for such methods. The method includes, without limitation, use of a CRISPR-Cas system as a substitute for, e.g., an interfering ribonucleic acid (such as an siRNA or shRNA) or a transcription template thereof, e.g., a DNA encoding an shRNA. The CRISPR-Cas system is introduced into a target cell, e.g., by being administered to a mammal that includes the target cell.

Advantageously, a CRISPR-Cas system of the invention is specific. For example, whereas interfering ribonucleic acid (such as an siRNA or shRNA) polynucleotide systems are plagued by design and stability issues and off-target binding, a CRISPR-Cas system of the invention can be designed with high specificity.

In an aspect of the invention, novel RNA targeting systems, also referred to as RNA- or RNA-targeting CRISPR systems of the present application, are based on herein-identified CRISPR-Cas proteins which do not require the generation of customized proteins to target specific RNA sequences; rather, a single enzyme can be programmed by an RNA molecule to recognize a specific RNA target. In other words, the enzyme can be recruited to a specific RNA target using said RNA molecule.
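The programming of a single enzyme by an RNA molecule can be sketched as follows: a guide sequence is chosen to be complementary to a window of the target transcript, so that changing the target only requires changing the guide. This is a minimal, hedged Python sketch; the function name `spacer_for_target`, the default guide length of 28 nt, and the example transcript are assumptions made for illustration and are not specified by the text above.

```python
# Watson-Crick complement table for RNA letters.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def spacer_for_target(transcript: str, start: int, length: int = 28) -> str:
    """Return a guide sequence complementary to transcript[start:start+length].

    The transcript is read 5'->3'; the returned guide is also written 5'->3',
    i.e. it is the reverse complement of the chosen target window.
    The default length of 28 is an illustrative assumption.
    """
    rna = transcript.upper().replace("T", "U")  # accept DNA-style input
    if start < 0 or length <= 0 or start + length > len(rna):
        raise ValueError("target window falls outside the transcript")
    site = rna[start:start + length]
    if set(site) - set("ACGU"):
        raise ValueError(f"unexpected characters in target site: {site!r}")
    return site.translate(RNA_COMPLEMENT)[::-1]


# Hypothetical six-nucleotide example, kept short so it can be checked by eye.
example_guide = spacer_for_target("AUGGCC", start=0, length=6)
print(example_guide)
```

The same protein is reused for every target in this sketch; only the string returned by `spacer_for_target` changes, which mirrors the contrast with approaches that need a customized protein per target.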

In some embodiments, one or more elements of a nucleic acid-targeting system is derived from a particular organism comprising an endogenous CRISPR RNA-targeting system. In certain embodiments, the CRISPR RNA-targeting system is found in Eubacterium and Ruminococcus. In certain embodiments, the effector protein comprises targeted and collateral ssRNA cleavage activity. In certain embodiments, the effector protein comprises dual HEPN domains. In certain embodiments, the effector protein lacks a counterpart to the Helical-1 domain of Cas13a. In certain embodiments, the effector protein is smaller than previously characterized class 2 CRISPR effectors, with a median size of 928 aa. This median size is 190 aa (17%) less than that of Cas13c, more than 200 aa (18%) less than that of Cas13b, and more than 300 aa (26%) less than that of Cas13a. In certain embodiments, the effector protein has no requirement for a flanking sequence (e.g., PFS, PAM).
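The size comparison above can be checked with a short calculation. The sketch below assumes that each percentage is expressed relative to the larger (reference) median, i.e. percent = difference / (928 + difference); this reading is an assumption, since the text gives only the 928 aa median, the differences, and the percentages. The function names are hypothetical.

```python
CAS13D_MEDIAN_AA = 928  # median effector size stated in the text


def implied_reference_size(delta_aa: float) -> float:
    """Median size of the reference family implied by a stated difference."""
    return CAS13D_MEDIAN_AA + delta_aa


def percent_smaller(delta_aa: float) -> float:
    """Difference as a percentage of the implied reference size (assumed basis)."""
    return 100.0 * delta_aa / implied_reference_size(delta_aa)


def delta_for_percent(percent: float) -> float:
    """Difference in aa that would yield the given percentage on the same basis."""
    fraction = percent / 100.0
    return fraction * CAS13D_MEDIAN_AA / (1.0 - fraction)


# Differences as stated; "more than" values are treated as lower bounds.
for name, delta in [("Cas13c", 190), ("Cas13b", 200), ("Cas13a", 300)]:
    print(name, implied_reference_size(delta), round(percent_smaller(delta), 1))
print(round(delta_for_percent(26)))
```

On this basis, 190 aa corresponds to about 17.0% and 200 aa to about 17.7%, consistent with the stated 17% and 18%. A difference of exactly 300 aa would give about 24.4%, so the stated 26% corresponds to a difference of roughly 326 aa, which agrees with the wording "more than 300 aa".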

In certain embodiments, the effector protein locus structures include a WYL domain containing accessory protein (so denoted after three amino acids that were conserved in the originally identified group of these domains; see, e.g., WYL domain IPR026881). In certain embodiments, the WYL domain accessory protein comprises at least one helix-turn-helix (HTH) or ribbon-helix-helix (RHH) DNA-binding domain. In certain embodiments, the WYL domain containing accessory protein increases both the targeted and the collateral ssRNA cleavage activity of the RNA-targeting effector protein. In certain embodiments, the WYL domain containing accessory protein comprises an N-terminal RHH domain, as well as a pattern of primarily hydrophobic conserved residues, including an invariant tyrosine-leucine doublet corresponding to the original WYL motif. In certain embodiments, the WYL domain containing accessory protein is WYL1. WYL1 is a single WYL-domain protein associated primarily with Ruminococcus.

In other example embodiments, the Type VI RNA-targeting Cas enzyme is Cas13d. In certain embodiments, Cas13d is Eubacterium siraeum DSM 15702 (EsCas13d) or Ruminococcus sp. N15.MGS-57 (RspCas13d) (see, e.g., Yan et al., Cas13d Is a Compact RNA-Targeting Type VI CRISPR Effector Positively Modulated by a WYL-Domain-Containing Accessory Protein, Molecular Cell (2018), doi.org/10.1016/j.molcel.2018.02.028). RspCas13d and EsCas13d have no flanking sequence requirements (e.g., PFS, PAM).

Application of the CRISPR-Cas Proteins in Optimized Functional RNA Targeting Systems

In an aspect the invention provides a system for specific delivery of functional components to the RNA environment. This can be ensured using the CRISPR systems comprising the RNA targeting effector proteins of the present invention, which allow specific targeting of different components to RNA. More particularly, such components include activators or repressors, such as activators or repressors of RNA translation, degradation, etc. Applications of this system are described elsewhere herein.

According to one aspect the invention provides a non-naturally occurring or engineered composition comprising a guide RNA comprising a guide sequence capable of hybridizing to a target sequence in a genomic locus of interest in a cell, wherein the guide RNA is modified by the insertion of one or more distinct RNA sequence(s) that bind an adaptor protein. In particular embodiments, the RNA sequences (e.g. aptamers) may bind to two or more adaptor proteins, wherein each adaptor protein is associated with one or more functional domains. The guide RNAs of the CRISPR-Cas enzymes described herein are shown to be amenable to modification of the guide sequence. In particular embodiments, the guide RNA is modified by the insertion of distinct RNA sequence(s) 5′ of the direct repeat, within the direct repeat, or 3′ of the guide sequence. When there is more than one functional domain, the functional domains can be the same or different, e.g., two of the same or two different activators or repressors. In an aspect the invention provides a herein-discussed composition, wherein the one or more functional domains are attached to the RNA targeting enzyme so that upon binding to the target RNA the functional domain is in a spatial orientation allowing for the functional domain to function in its attributed function. In an aspect the invention provides a herein-discussed composition, wherein the composition comprises a CRISPR-Cas complex having at least three functional domains, at least one of which is associated with the RNA targeting enzyme and at least two of which are associated with the gRNA.

Accordingly, in an aspect the invention provides a non-naturally occurring or engineered CRISPR-Cas complex composition comprising the guide RNA as herein-discussed and a CRISPR-Cas which is an RNA targeting enzyme, wherein optionally the RNA targeting enzyme comprises at least one mutation, such that the RNA targeting enzyme has no more than 5% of the nuclease activity of the enzyme not having the at least one mutation, and optionally comprising one or more nuclear localization sequences. In particular embodiments, the guide RNA is additionally or alternatively modified so as to still ensure binding of the RNA targeting enzyme but to prevent cleavage by the RNA targeting enzyme (as detailed elsewhere herein).

In particular embodiments, the RNA targeting enzyme is a CRISPR-Cas protein whose nuclease activity is diminished by at least 97%, or by 100%, as compared with the CRISPR-Cas enzyme not having the at least one mutation. In an aspect the invention provides a herein-discussed composition, wherein the CRISPR-Cas enzyme comprises two or more mutations as otherwise herein-discussed.

In particular embodiments, an RNA targeting system is provided as described herein above comprising two or more functional domains. In particular embodiments, the two or more functional domains are heterologous functional domains. In particular embodiments, the system comprises an adaptor protein which is a fusion protein comprising a functional domain, the fusion protein optionally comprising a linker between the adaptor protein and the functional domain. In particular embodiments, the linker includes a GlySer linker. Additionally or alternatively, one or more functional domains are attached to the RNA effector protein by way of a linker, optionally a GlySer linker. In particular embodiments, the one or more functional domains are attached to the RNA targeting enzyme through one or both of the HEPN domains.

In an aspect the invention provides a herein-discussed composition, wherein the one or more functional domains associated with the adaptor protein or the RNA targeting enzyme is a domain capable of activating or repressing RNA translation. In an aspect the invention provides a herein-discussed composition, wherein at least one of the one or more functional domains associated with the adaptor protein has one or more activities comprising methylase activity, demethylase activity, transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, DNA integration activity, RNA cleavage activity, DNA cleavage activity or nucleic acid binding activity, or molecular switch activity or chemical inducibility or light inducibility.

In an aspect the invention provides a herein-discussed composition comprising an aptamer sequence. In particular embodiments, the aptamer sequence is two or more aptamer sequences specific to the same adaptor protein. In an aspect the invention provides a herein-discussed composition, wherein the aptamer sequence is two or more aptamer sequences specific to different adaptor proteins. In an aspect the invention provides a herein-discussed composition, wherein the adaptor protein comprises MS2, PP7, Qβ, F2, GA, fr, JP501, M12, R17, BZ13, JP34, JP500, KU1, M11, MX1, TW18, VK, SP, FI, ID2, NL95, TW19, AP205, ϕCb5, ϕCb8r, ϕCb12r, ϕCb23r, 7s, PRR1. Accordingly, in particular embodiments, the aptamer is selected from aptamers specifically binding any one of the adaptor proteins listed above. In an aspect the invention provides a herein-discussed composition, wherein the cell is a eukaryotic cell. In an aspect the invention provides a herein-discussed composition, wherein the eukaryotic cell is a mammalian cell, a plant cell or a yeast cell, whereby the mammalian cell is optionally a mouse cell. In an aspect the invention provides a herein-discussed composition, wherein the mammalian cell is a human cell.

In an aspect the invention provides a herein above-discussed composition wherein there is more than one guide RNA or gRNA or crRNA, and these target different sequences whereby when the composition is employed, there is multiplexing. In an aspect the invention provides a composition wherein there is more than one guide RNA or gRNA or crRNA modified by the insertion of distinct RNA sequence(s) that bind to one or more adaptor proteins.

In an aspect the invention provides a herein-discussed composition wherein one or more adaptor proteins associated with one or more functional domains is present and bound to the distinct RNA sequence(s) inserted into the guide RNA(s).

In an aspect the invention provides a herein-discussed composition wherein the guide RNA is modified to have at least one non-coding functional loop; e.g., wherein the at least one non-coding functional loop is repressive; for instance, wherein at least one non-coding functional loop comprises Alu.

In an aspect the invention provides a method for modifying gene expression comprising the administration to a host or expression in a host in vivo of one or more of the compositions as herein-discussed.

In an aspect the invention provides a herein-discussed method comprising the delivery of the composition or nucleic acid molecule(s) coding therefor, wherein said nucleic acid molecule(s) are operatively linked to regulatory sequence(s) and expressed in vivo. In an aspect the invention provides a herein-discussed method wherein the expression in vivo is via a lentivirus, an adenovirus, or an AAV.

In an aspect the invention provides a mammalian cell line of cells as herein-discussed, wherein the cell line is, optionally, a human cell line or a mouse cell line. In an aspect the invention provides a transgenic mammalian model, optionally a mouse, wherein the model has been transformed with a herein-discussed composition or is a progeny of said transformant.

In an aspect the invention provides nucleic acid molecule(s) encoding the guide RNA or the RNA targeting CRISPR-Cas complex or the composition as herein-discussed. In an aspect the invention provides a vector comprising: a nucleic acid molecule encoding a guide RNA (gRNA) or crRNA comprising a guide sequence capable of hybridizing to an RNA target sequence in a cell, wherein the direct repeat of the gRNA or crRNA is modified by the insertion of distinct RNA sequence(s) that bind(s) to two or more adaptor proteins, and wherein each adaptor protein is associated with one or more functional domains; or, wherein the gRNA is modified to have at least one non-coding functional loop. In an aspect the invention provides vector(s) comprising nucleic acid molecule(s) encoding: a non-naturally occurring or engineered CRISPR-Cas complex composition comprising the gRNA or crRNA herein-discussed, and an RNA targeting enzyme, wherein optionally the RNA targeting enzyme comprises at least one mutation, such that the RNA targeting enzyme has no more than 5% of the nuclease activity of the RNA targeting enzyme not having the at least one mutation, and optionally comprising one or more nuclear localization sequences. In an aspect a vector can further comprise regulatory element(s) operable in a eukaryotic cell operably linked to the nucleic acid molecule encoding the guide RNA (gRNA) or crRNA and/or the nucleic acid molecule encoding the RNA targeting enzyme and/or the optional nuclear localization sequence(s).

In one aspect, the invention provides a kit comprising one or more of the components described herein. In some embodiments, the kit comprises a vector system as described herein and instructions for using the kit.

In an aspect the invention provides a method of screening for gain of function (GOF) or loss of function (LOF) or for screening non-coding RNAs or potential regulatory regions (e.g. enhancers, repressors) comprising the cell line as herein-discussed or cells of the model herein-discussed containing or expressing the RNA targeting enzyme and introducing a composition as herein-discussed into cells of the cell line or model, whereby the gRNA or crRNA includes either an activator or a repressor, and monitoring for GOF or LOF respectively as to those cells as to which the introduced gRNA or crRNA includes an activator or as to those cells as to which the introduced gRNA or crRNA includes a repressor.

In an aspect the invention provides a library of non-naturally occurring or engineered compositions, each comprising an RNA targeting CRISPR guide RNA (gRNA) or crRNA comprising a guide sequence capable of hybridizing to a target RNA sequence of interest in a cell, an RNA targeting enzyme, wherein the RNA targeting enzyme comprises at least one mutation, such that the RNA targeting enzyme has no more than 5% of the nuclease activity of the RNA targeting enzyme not having the at least one mutation, wherein the gRNA or crRNA is modified by the insertion of distinct RNA sequence(s) that bind to one or more adaptor proteins, and wherein the adaptor protein is associated with one or more functional domains, wherein the composition comprises one or more or two or more adaptor proteins, wherein each protein is associated with one or more functional domains, and wherein the gRNAs or crRNAs comprise a genome wide library comprising a plurality of RNA targeting guide RNAs (gRNAs) or crRNAs. In an aspect the invention provides a library as herein-discussed, wherein the RNA targeting enzyme has a nuclease activity diminished by at least 97%, or by 100%, as compared with the RNA targeting enzyme not having the at least one mutation. In an aspect the invention provides a library as herein-discussed, wherein the adaptor protein is a fusion protein comprising the functional domain. In an aspect the invention provides a library as herein discussed, wherein the gRNA or crRNA is not modified by the insertion of distinct RNA sequence(s) that bind to the one or two or more adaptor proteins. In an aspect the invention provides a library as herein discussed, wherein the one or two or more functional domains are associated with the RNA targeting enzyme. In an aspect the invention provides a library as herein discussed, wherein the population of cells is a population of eukaryotic cells. In an aspect the invention provides a library as herein discussed, wherein the eukaryotic cell is a mammalian cell, a plant cell or a yeast cell. In an aspect the invention provides a library as herein discussed, wherein the mammalian cell is a human cell. In an aspect the invention provides a library as herein discussed, wherein the population of cells is a population of embryonic stem (ES) cells.

In an aspect the invention provides a library as herein discussed, wherein the targeting is of about 100 or more RNA sequences. In an aspect the invention provides a library as herein discussed, wherein the targeting is of about 1000 or more RNA sequences. In an aspect the invention provides a library as herein discussed, wherein the targeting is of about 20,000 or more sequences. In an aspect the invention provides a library as herein discussed, wherein the targeting is of the entire transcriptome. In an aspect the invention provides a library as herein discussed, wherein the targeting is of a panel of target sequences focused on a relevant or desirable pathway. In an aspect the invention provides a library as herein discussed, wherein the pathway is an immune pathway. In an aspect the invention provides a library as herein discussed, wherein the pathway is a cell division pathway.

In one aspect, the invention provides a method of generating a model eukaryotic cell comprising a gene with modified expression. In some embodiments, a disease gene is any gene associated with an increase in the risk of having or developing a disease. In some embodiments, the method comprises (a) introducing one or more vectors encoding the components of the system described herein above into a eukaryotic cell, and (b) allowing a CRISPR complex to bind to a target polynucleotide so as to modify expression of a gene, thereby generating a model eukaryotic cell comprising modified gene expression.

The structural information provided herein allows for interrogation of guide RNA or crRNA interaction with the target RNA and the RNA targeting enzyme, permitting engineering or alteration of guide RNA structure to optimize functionality of the entire RNA targeting CRISPR-Cas system. For example, the guide RNA or crRNA may be extended, without colliding with the RNA targeting protein, by the insertion of adaptor proteins that can bind to RNA. These adaptor proteins can further recruit effector proteins or fusions which comprise one or more functional domains.

An aspect of the invention is that the above elements are comprised in a single composition or comprised in individual compositions. These compositions may advantageously be applied to a host to elicit a functional effect on the genomic level.

The skilled person will understand that modifications to the guide RNA or crRNA which allow for binding of the adapter+functional domain but not proper positioning of the adapter+functional domain (e.g. due to steric hindrance within the three-dimensional structure of the CRISPR-Cas complex) are modifications which are not intended. The one or more modified guide RNA or crRNA may be modified by introduction of distinct RNA sequence(s) 5′ of the direct repeat, within the direct repeat, or 3′ of the guide sequence.

The modified guide RNA or crRNA, the inactivated RNA targeting enzyme (with or without functional domains), and the binding protein with one or more functional domains, may each individually be comprised in a composition and administered to a host individually or collectively. Alternatively, these components may be provided in a single composition for administration to a host. Administration to a host may be performed via viral vectors known to the skilled person or described herein for delivery to a host (e.g. lentiviral vector, adenoviral vector, AAV vector). As explained herein, use of different selection markers (e.g. for lentiviral gRNA or crRNA selection) and concentration of gRNA or crRNA (e.g. dependent on whether multiple gRNAs or crRNAs are used) may be advantageous for eliciting an improved effect.

Using the provided compositions, the person skilled in the art can advantageously and specifically target single or multiple loci with the same or different functional domains to elicit one or more genomic events. The compositions may be applied in a wide variety of methods for screening in libraries in cells and functional modeling in vivo (e.g. gene activation of lincRNA and identification of function; gain-of-function modeling; loss-of-function modeling; the use of the compositions of the invention to establish cell lines and transgenic animals for optimization and screening purposes).

The current invention comprehends the use of the compositions of the current invention to establish and utilize conditional or inducible CRISPR-Cas RNA targeting events. (See, e.g., Platt et al., Cell (2014), dx.doi.org/10.1016/j.cell.2014.09.014, or PCT patent publications cited herein, such as WO 2014/093622 (PCT/US2013/074667), which are not believed prior to the present invention or application).

Delivery of Functional Effectors

CRISPR-Cas13 knockdown allows for temporary reduction of gene expression through the use of artificial transcription factors. For example, mutating residues in the cleavage domain(s) of the Cas13 protein results in the generation of a catalytically inactive Cas13 protein. A catalytically inactive Cas13 complexes with a guide RNA or crRNA and localizes to the RNA sequence specified by that guide RNA's or crRNA's targeting domain; however, it does not cleave the target. Fusion of the inactive Cas13 protein to an effector domain, e.g., a transcription repression domain, enables recruitment of the effector to any site specified by the guide RNA.

Optimized Functional RNA Targeting Systems

In an aspect the invention thus provides a system for specific delivery of functional components to the RNA environment. This can be ensured using the CRISPR systems comprising the RNA targeting effector proteins of the present invention, which allow specific targeting of different components to RNA. More particularly, such components include activators or repressors, such as activators or repressors of RNA translation, degradation, etc.

According to one aspect the invention provides a non-naturally occurring or engineered composition comprising a guide RNA or crRNA comprising a guide sequence capable of hybridizing to a target sequence of interest in a cell, wherein the guide RNA or crRNA is modified by the insertion of one or more distinct RNA sequence(s) that bind an adaptor protein. In particular embodiments, the RNA sequences (e.g. aptamers) may bind to two or more adaptor proteins, wherein each adaptor protein is associated with one or more functional domains. When there is more than one functional domain, the functional domains can be the same or different, e.g., two of the same or two different activators or repressors. In an aspect the invention provides a herein-discussed composition, wherein the one or more functional domains are attached to the RNA targeting enzyme so that upon binding to the target RNA the functional domain is in a spatial orientation allowing for the functional domain to function in its attributed function. In an aspect the invention provides a herein-discussed composition, wherein the composition comprises a CRISPR-Cas13 complex having at least three functional domains, at least one of which is associated with the RNA targeting enzyme and at least two of which are associated with the gRNA or crRNA.

Application of RNA Targeting-CRISPR System to Plants and Yeast

Definitions:

In general, the term “plant” relates to any of various photosynthetic, eukaryotic, unicellular or multicellular organisms of the kingdom Plantae characteristically growing by cell division, containing chloroplasts, and having cell walls comprised of cellulose. The term plant encompasses monocotyledonous and dicotyledonous plants. Specifically, the plants are intended to comprise without limitation angiosperm and gymnosperm plants such as acacia, alfalfa, amaranth, apple, apricot, artichoke, ash tree, asparagus, avocado, banana, barley, beans, beet, birch, beech, blackberry, blueberry, broccoli, Brussels sprouts, cabbage, canola, cantaloupe, carrot, cassava, cauliflower, cedar, a cereal, celery, chestnut, cherry, Chinese cabbage, citrus, clementine, clover, coffee, corn, cotton, cowpea, cucumber, cypress, eggplant, elm, endive, eucalyptus, fennel, figs, fir, geranium, grape, grapefruit, groundnuts, ground cherry, gum hemlock, hickory, kale, kiwifruit, kohlrabi, larch, lettuce, leek, lemon, lime, locust, pine, maidenhair, maize, mango, maple, melon, millet, mushroom, mustard, nuts, oak, oats, oil palm, okra, onion, orange, an ornamental plant or flower or tree, papaya, palm, parsley, parsnip, pea, peach, peanut, pear, peat, pepper, persimmon, pigeon pea, pine, pineapple, plantain, plum, pomegranate, potato, pumpkin, radicchio, radish, rapeseed, raspberry, rice, rye, sorghum, safflower, sallow, soybean, spinach, spruce, squash, strawberry, sugar beet, sugarcane, sunflower, sweet potato, sweet corn, tangerine, tea, tobacco, tomato, trees, triticale, turf grasses, turnips, vine, walnut, watercress, watermelon, wheat, yams, yew, and zucchini. The term plant also encompasses algae, which are mainly photoautotrophs unified primarily by their lack of roots, leaves and other organs that characterize higher plants.

The methods for modulating gene expression using the RNA targeting system as described herein can be used to confer desired traits on essentially any plant. A wide variety of plants and plant cell systems may be engineered for the desired physiological and agronomic characteristics described herein using the nucleic acid constructs of the present disclosure and the various transformation methods mentioned above. In preferred embodiments, target plants and plant cells for engineering include, but are not limited to, those monocotyledonous and dicotyledonous plants, such as crops including grain crops (e.g., wheat, maize, rice, millet, barley), fruit crops (e.g., tomato, apple, pear, strawberry, orange), forage crops (e.g., alfalfa), root vegetable crops (e.g., carrot, potato, sugar beets, yam), leafy vegetable crops (e.g., lettuce, spinach); flowering plants (e.g., petunia, rose, chrysanthemum), conifers and pine trees (e.g., pine, fir, spruce); plants used in phytoremediation (e.g., heavy metal accumulating plants); oil crops (e.g., sunflower, rape seed) and plants used for experimental purposes (e.g., Arabidopsis). Thus, the methods and CRISPR-Cas systems can be used over a broad range of plants, such as for example with dicotyledonous plants belonging to the orders Magnoliales, Illiciales, Laurales, Piperales, Aristolochiales, Nymphaeales, Ranunculales, Papaverales, Sarraceniaceae, Trochodendrales, Hamamelidales, Eucommiales, Leitneriales, Myricales, Fagales, Casuarinales, Caryophyllales, Batales, Polygonales, Plumbaginales, Dilleniales, Theales, Malvales, Urticales, Lecythidales, Violales, Salicales, Capparales, Ericales, Diapensiales, Ebenales, Primulales, Rosales, Fabales, Podostemales, Haloragales, Myrtales, Cornales, Proteales, Santalales, Rafflesiales, Celastrales, Euphorbiales, Rhamnales, Sapindales, Juglandales, Geraniales, Polygalales, Umbellales, Gentianales, Polemoniales, Lamiales, Plantaginales, Scrophulariales, Campanulales, Rubiales, Dipsacales, and Asterales; the methods and CRISPR-Cas systems can be used with monocotyledonous plants such as those belonging to the orders Alismatales, Hydrocharitales, Najadales, Triuridales, Commelinales, Eriocaulales, Restionales, Poales, Juncales, Cyperales, Typhales, Bromeliales, Zingiberales, Arecales, Cyclanthales, Pandanales, Arales, Liliales, and Orchidales, or with plants belonging to Gymnospermae, e.g., those belonging to the orders Pinales, Ginkgoales, Cycadales, Araucariales, Cupressales and Gnetales.

The RNA targeting CRISPR systems and methods of use described herein can be used over a broad range of plant species, included in the non-limitative list of dicot, monocot or gymnosperm genera hereunder: Atropa, Alseodaphne, Anacardium, Arachis, Beilschmiedia, Brassica, Carthamus, Cocculus, Croton, Cucumis, Citrus, Citrullus, Capsicum, Catharanthus, Cocos, Coffea, Cucurbita, Daucus, Duguetia, Eschscholzia, Ficus, Fragaria, Glaucium, Glycine, Gossypium, Helianthus, Hevea, Hyoscyamus, Lactuca, Landolphia, Linum, Litsea, Lycopersicon, Lupinus, Manihot, Majorana, Malus, Medicago, Nicotiana, Olea, Parthenium, Papaver, Persea, Phaseolus, Pistacia, Pisum, Pyrus, Prunus, Raphanus, Ricinus, Senecio, Sinomenium, Stephania, Sinapis, Solanum, Theobroma, Trifolium, Trigonella, Vicia, Vinca, Vitis, and Vigna; and the genera Allium, Andropogon, Eragrostis, Asparagus, Avena, Cynodon, Elaeis, Festuca, Festulolium, Hemerocallis, Hordeum, Lemna, Lolium, Musa, Oryza, Panicum, Pennisetum, Phleum, Poa, Secale, Sorghum, Triticum, Zea, Abies, Cunninghamia, Ephedra, Picea, Pinus, and Pseudotsuga.

The RNA targeting CRISPR systems and methods of use can also be used over a broad range of “algae” or “algae cells”, including for example algae selected from several eukaryotic phyla, including the Rhodophyta (red algae), Chlorophyta (green algae), Phaeophyta (brown algae), Bacillariophyta (diatoms), Eustigmatophyta and dinoflagellates as well as the prokaryotic phylum Cyanobacteria (blue-green algae). The term “algae” includes for example algae selected from: Amphora, Anabaena, Ankistrodesmus, Botryococcus, Chaetoceros, Chlamydomonas, Chlorella, Chlorococcum, Cyclotella, Cylindrotheca, Dunaliella, Emiliania, Euglena, Haematococcus, Isochrysis, Monochrysis, Monoraphidium, Nannochloris, Nannochloropsis, Navicula, Nephrochloris, Nephroselmis, Nitzschia, Nodularia, Nostoc, Ochromonas, Oocystis, Oscillatoria, Pavlova, Phaeodactylum, Platymonas, Pleurochrysis, Porphyra, Pseudanabaena, Pyramimonas, Stichococcus, Synechococcus, Synechocystis, Tetraselmis, Thalassiosira, and Trichodesmium.

A part of a plant, i.e., a “plant tissue”, may be treated according to the methods of the present invention to produce an improved plant. Plant tissue also encompasses plant cells. The term “plant cell” as used herein refers to individual units of a living plant, either in an intact whole plant or in an isolated form grown in in vitro tissue cultures, on media or agar, in suspension in a growth media or buffer or as a part of higher organized units, such as, for example, plant tissue, a plant organ, or a whole plant.

A “protoplast” refers to a plant cell that has had its protective cell wall completely or partially removed using, for example, mechanical or enzymatic means, resulting in an intact, biochemically competent unit of a living plant that can reform its cell wall, proliferate and regenerate into a whole plant under proper growing conditions.

The term “transformation” broadly refers to the process by which a plant host is genetically modified by the introduction of DNA by means of Agrobacteria or one of a variety of chemical or physical methods. As used herein, the term “plant host” refers to plants, including any cells, tissues, organs, or progeny of the plants. Many suitable plant tissues or plant cells can be transformed and include, but are not limited to, protoplasts, somatic embryos, pollen, leaves, seedlings, stems, calli, stolons, microtubers, and shoots. A plant tissue also refers to any clone of such a plant, seed, progeny, propagule whether generated sexually or asexually, and descendants of any of these, such as cuttings or seed.

The term “transformed” as used herein, refers to a cell, tissue, organ, or organism into which a foreign DNA molecule, such as a construct, has been introduced. The introduced DNA molecule may be integrated into the genomic DNA of the recipient cell, tissue, organ, or organism such that the introduced DNA molecule is transmitted to the subsequent progeny. In these embodiments, the “transformed” or “transgenic” cell or plant may also include progeny of the cell or plant and progeny produced from a breeding program employing such a transformed plant as a parent in a cross and exhibiting an altered phenotype resulting from the presence of the introduced DNA molecule. Preferably, the transgenic plant is fertile and capable of transmitting the introduced DNA to progeny through sexual reproduction.

The term “progeny”, such as the progeny of a transgenic plant, is one that is born of, begotten by, or derived from a plant or the transgenic plant. The introduced DNA molecule may also be transiently introduced into the recipient cell such that the introduced DNA molecule is not inherited by subsequent progeny and thus not considered “transgenic”. Accordingly, as used herein, a “non-transgenic” plant or plant cell is a plant which does not contain a foreign DNA stably integrated into its genome.

The term “plant promoter” as used herein is a promoter capable of initiating transcription in plant cells, whether or not its origin is a plant cell. Exemplary suitable plant promoters include, but are not limited to, those that are obtained from plants, plant viruses, and bacteria such as Agrobacterium or Rhizobium which comprise genes expressed in plant cells.

As used herein, a “fungal cell” refers to any type of eukaryotic cell within the kingdom of fungi. Phyla within the kingdom of fungi include Ascomycota, Basidiomycota, Blastocladiomycota, Chytridiomycota, Glomeromycota, Microsporidia, and Neocallimastigomycota. Fungal cells may include yeasts, molds, and filamentous fungi. In some embodiments, the fungal cell is a yeast cell.

As used herein, the term “yeast cell” refers to any fungal cell within the phyla Ascomycota and Basidiomycota. Yeast cells may include budding yeast cells, fission yeast cells, and mold cells. Without being limited to these organisms, many types of yeast used in laboratory and industrial settings are part of the phylum Ascomycota. In some embodiments, the yeast cell is an S. cerevisiae, Kluyveromyces marxianus, or Issatchenkia orientalis cell. Other yeast cells may include without limitation Candida spp. (e.g., Candida albicans), Yarrowia spp. (e.g., Yarrowia lipolytica), Pichia spp. (e.g., Pichia pastoris), Kluyveromyces spp. (e.g., Kluyveromyces lactis and Kluyveromyces marxianus), Neurospora spp. (e.g., Neurospora crassa), Fusarium spp. (e.g., Fusarium oxysporum), and Issatchenkia spp. (e.g., Issatchenkia orientalis, a.k.a. Pichia kudriavzevii and Candida acidothermophilum). In some embodiments, the fungal cell is a filamentous fungal cell. As used herein, the term “filamentous fungal cell” refers to any type of fungal cell that grows in filaments, i.e., hyphae or mycelia. Examples of filamentous fungal cells may include without limitation Aspergillus spp. (e.g., Aspergillus niger), Trichoderma spp. (e.g., Trichoderma reesei), Rhizopus spp. (e.g., Rhizopus oryzae), and Mortierella spp. (e.g., Mortierella isabellina).

In some embodiments, the fungal cell is an industrial strain. As used herein, “industrial strain” refers to any strain of fungal cell used in or isolated from an industrial process, e.g., production of a product on a commercial or industrial scale. Industrial strain may refer to a fungal species that is typically used in an industrial process, or it may refer to an isolate of a fungal species that may be also used for non-industrial purposes (e.g., laboratory research). Examples of industrial processes may include fermentation (e.g., in production of food or beverage products), distillation, biofuel production, production of a compound, and production of a polypeptide. Examples of industrial strains may include, without limitation, JAY270 and ATCC4124.

In some embodiments, the fungal cell is a polyploid cell. As used herein, a “polyploid” cell may refer to any cell whose genome is present in more than one copy. A polyploid cell may refer to a type of cell that is naturally found in a polyploid state, or it may refer to a cell that has been induced to exist in a polyploid state (e.g., through specific regulation, alteration, inactivation, activation, or modification of meiosis, cytokinesis, or DNA replication). A polyploid cell may refer to a cell whose entire genome is polyploid, or it may refer to a cell that is polyploid in a particular genomic locus of interest. Without wishing to be bound to theory, it is thought that the abundance of guide RNA may more often be a rate-limiting component in genome engineering of polyploid cells than in haploid cells, and thus the methods using the CRISPR-Cas system described herein may take advantage of using a certain fungal cell type.

In some embodiments, the fungal cell is a diploid cell. As used herein, a “diploid” cell may refer to any cell whose genome is present in two copies. A diploid cell may refer to a type of cell that is naturally found in a diploid state, or it may refer to a cell that has been induced to exist in a diploid state (e.g., through specific regulation, alteration, inactivation, activation, or modification of meiosis, cytokinesis, or DNA replication). For example, the S. cerevisiae strain S288C may be maintained in a haploid or diploid state. A diploid cell may refer to a cell whose entire genome is diploid, or it may refer to a cell that is diploid in a particular genomic locus of interest. In some embodiments, the fungal cell is a haploid cell. As used herein, a “haploid” cell may refer to any cell whose genome is present in one copy. A haploid cell may refer to a type of cell that is naturally found in a haploid state, or it may refer to a cell that has been induced to exist in a haploid state (e.g., through specific regulation, alteration, inactivation, activation, or modification of meiosis, cytokinesis, or DNA replication). For example, the S. cerevisiae strain S288C may be maintained in a haploid or diploid state. A haploid cell may refer to a cell whose entire genome is haploid, or it may refer to a cell that is haploid in a particular genomic locus of interest.

As used herein, a “yeast expression vector” refers to a nucleic acid that contains one or more sequences encoding an RNA and/or polypeptide and may further contain any desired elements that control the expression of the nucleic acid(s), as well as any elements that enable the replication and maintenance of the expression vector inside the yeast cell. Many suitable yeast expression vectors and features thereof are known in the art; for example, various vectors and techniques are illustrated in Yeast Protocols, 2nd edition, Xiao, W., ed. (Humana Press, New York, 2007) and Buckholz, R. G. and Gleeson, M. A. (1991) Biotechnology (NY) 9(11): 1067-72. Yeast vectors may contain, without limitation, a centromeric (CEN) sequence, an autonomous replication sequence (ARS), a promoter, such as an RNA Polymerase III promoter, operably linked to a sequence or gene of interest, a terminator such as an RNA polymerase III terminator, an origin of replication, and a marker gene (e.g., auxotrophic, antibiotic, or other selectable markers). Examples of expression vectors for use in yeast may include plasmids, yeast artificial chromosomes, 2μ plasmids, yeast integrative plasmids, yeast replicative plasmids, shuttle vectors, and episomal plasmids.

Stable Integration of RNA Targeting CRISPR System Components in the Genome of Plants and Plant Cells

In particular embodiments, it is envisaged that the polynucleotides encoding the components of the RNA targeting CRISPR system are introduced for stable integration into the genome of a plant cell. In these embodiments, the design of the transformation vector or the expression system can be adjusted depending on when, where and under what conditions the guide RNA and/or the RNA targeting gene(s) are expressed.

In particular embodiments, it is envisaged to introduce the components of the RNA targeting CRISPR system stably into the genomic DNA of a plant cell. Additionally or alternatively, it is envisaged to introduce the components of the RNA targeting CRISPR system for stable integration into the DNA of a plant organelle such as, but not limited to, a plastid, a mitochondrion or a chloroplast.

The expression system for stable integration into the genome of a plant cell may contain one or more of the following elements: a promoter element that can be used to express the guide RNA and/or RNA targeting enzyme in a plant cell; a 5′ untranslated region to enhance expression; an intron element to further enhance expression in certain cells, such as monocot cells; a multiple-cloning site to provide convenient restriction sites for inserting the one or more guide RNAs and/or the RNA targeting gene sequences and other desired elements; and a 3′ untranslated region to provide for efficient termination of the expressed transcript.
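As a rough illustration of the element list above, the following Python sketch models an expression cassette as a simple record and prints the elements that are present. The class name `ExpressionCassette`, its field names, the 5′ to 3′ ordering used in `layout`, and the example element labels are assumptions made for illustration only; they are not taken from the disclosure, and every element is optional because the system "may contain one or more" of them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExpressionCassette:
    """Hypothetical record of the optional elements of an expression system."""
    promoter: Optional[str] = None
    five_prime_utr: Optional[str] = None
    intron: Optional[str] = None
    # Sequences cloned at the multiple-cloning site (guide RNAs, RNA targeting gene).
    inserts: List[str] = field(default_factory=list)
    three_prime_utr: Optional[str] = None

    def layout(self) -> List[str]:
        """List the elements that are present, in an assumed 5' to 3' order."""
        ordered = [
            ("promoter", self.promoter),
            ("5' UTR", self.five_prime_utr),
            ("intron", self.intron),
        ]
        ordered += [("insert at MCS", name) for name in self.inserts]
        ordered.append(("3' UTR", self.three_prime_utr))
        # Skip any element that was not supplied.
        return [f"{kind}: {name}" for kind, name in ordered if name]


cassette = ExpressionCassette(
    promoter="CaMV 35S promoter",
    inserts=["RNA targeting protein CDS"],
    three_prime_utr="hypothetical 3' UTR",
)
for line in cassette.layout():
    print(line)
```

Keeping the record free of required fields mirrors the wording of the paragraph; a stricter design could require a promoter, but that would be a narrower reading than the text supports.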

The elements of the expression system may be on one or more expression constructs which are either circular such as a plasmid or transformation vector, or non-circular such as linear double stranded DNA.

In a particular embodiment, an RNA targeting CRISPR expression system comprises at least:

-   (a) a nucleotide sequence encoding a guide RNA (gRNA) that hybridizes with a target sequence in a plant, and wherein the guide RNA comprises a guide sequence and a direct repeat sequence, and
-   (b) a nucleotide sequence encoding an RNA targeting protein,
    wherein components (a) or (b) are located on the same or on different constructs, and whereby the different nucleotide sequences can be under control of the same or a different regulatory element operable in a plant cell.
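Component (a) describes a guide RNA made of a guide sequence and a direct repeat sequence. The Python sketch below shows one way such a guide could be assembled in silico: the guide sequence is taken as the reverse complement of the target RNA region so that it can hybridize to it, and the direct repeat is appended or prepended. The function names, the `repeat_at_3prime` parameter, and the sequences in the usage example are hypothetical placeholders; which end carries the direct repeat depends on the particular Cas13 protein and is treated here as an assumption to be set by the user.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA (or DNA-notated) sequence as RNA."""
    rna = seq.upper().replace("T", "U")
    unexpected = set(rna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return rna.translate(COMPLEMENT)[::-1]


def build_guide_rna(target_rna: str, direct_repeat: str,
                    repeat_at_3prime: bool = True) -> str:
    """Join a guide sequence (complementary to the target) with a direct repeat.

    repeat_at_3prime is an illustrative switch; the correct orientation
    depends on the RNA targeting protein actually used.
    """
    guide = reverse_complement_rna(target_rna)
    repeat = direct_repeat.upper().replace("T", "U")
    return guide + repeat if repeat_at_3prime else repeat + guide


# Placeholder sequences, far shorter than real guides or repeats.
print(build_guide_rna("AUGC", "GGGG"))         # GCAUGGGG
print(build_guide_rna("AUGC", "GGGG", False))  # GGGGGCAU
```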

DNA construct(s) containing the components of the RNA targeting CRISPR system may be introduced into the genome of a plant, plant part, or plant cell by a variety of conventional techniques. The process generally comprises the steps of selecting a suitable host cell or host tissue, introducing the construct(s) into the host cell or host tissue, and regenerating plant cells or plants therefrom.

In particular embodiments, the DNA construct may be introduced into the plant cell using techniques such as but not limited to electroporation, microinjection, aerosol beam injection of plant cell protoplasts, or the DNA constructs can be introduced directly to plant tissue using biolistic methods, such as DNA particle bombardment (see also Fu et al., Transgenic Res. 2000 February; 9(1):11-9). The basis of particle bombardment is the acceleration of particles coated with gene/s of interest toward cells, resulting in the penetration of the protoplasm by the particles and typically stable integration into the genome (see e.g. Klein et al., Nature (1987); Klein et al., Bio/Technology (1992); Casas et al., Proc. Natl. Acad. Sci. USA (1993)).

In particular embodiments, the DNA constructs containing components of the RNA targeting CRISPR system may be introduced into the plant by Agrobacterium-mediated transformation. The DNA constructs may be combined with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens host vector. The foreign DNA can be incorporated into the genome of plants by infecting the plants or by incubating plant protoplasts with Agrobacterium bacteria containing one or more Ti (tumor-inducing) plasmids (see e.g. Fraley et al., (1985), Rogers et al., (1987) and U.S. Pat. No. 5,563,055).

Plant Promoters

In order to ensure appropriate expression in a plant cell, the components of the CRISPR-Cas system described herein are typically placed under control of a plant promoter, i.e. a promoter operable in plant cells. The use of different types of promoters is envisaged.

A constitutive plant promoter is a promoter that is able to express the open reading frame (ORF) that it controls in all or nearly all of the plant tissues during all or nearly all developmental stages of the plant (referred to as “constitutive expression”). One non-limiting example of a constitutive promoter is the cauliflower mosaic virus 35S promoter. The present invention envisages methods for modifying RNA sequences and as such also envisages regulating expression of plant biomolecules. In particular embodiments of the present invention it is thus advantageous to place one or more elements of the RNA targeting CRISPR system under the control of a promoter that can be regulated. “Regulated promoter” refers to promoters that direct gene expression not constitutively, but in a temporally- and/or spatially-regulated manner, and includes tissue-specific, tissue-preferred and inducible promoters. Different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental conditions. In particular embodiments, one or more of the RNA targeting CRISPR components are expressed under the control of a constitutive promoter, such as the cauliflower mosaic virus 35S promoter. Tissue-preferred promoters can be utilized to target enhanced expression in certain cell types within a particular plant tissue, for instance vascular cells in leaves or roots or in specific cells of the seed. Examples of particular promoters for use in the RNA targeting CRISPR system are found in Kawamata et al., (1997) Plant Cell Physiol 38:792-803; Yamamoto et al., (1997) Plant J 12:255-65; Hire et al., (1992) Plant Mol Biol 20:207-18; Kuster et al., (1995) Plant Mol Biol 29:759-72; and Capana et al., (1994) Plant Mol Biol 25:681-91.

Examples of promoters that are inducible and that allow for spatiotemporal control of gene editing or gene expression may use a form of energy. The form of energy may include but is not limited to sound energy, electromagnetic radiation, chemical energy and/or thermal energy. Examples of inducible systems include tetracycline inducible promoters (Tet-On or Tet-Off), small molecule two-hybrid transcription activation systems (FKBP, ABA, etc.), or light inducible systems (Phytochrome, LOV domains, or cryptochrome), such as a Light Inducible Transcriptional Effector (LITE) that directs changes in transcriptional activity in a sequence-specific manner. The components of a light inducible system may include an RNA targeting CRISPR-Cas, a light-responsive cryptochrome heterodimer (e.g. from Arabidopsis thaliana), and a transcriptional activation/repression domain. Further examples of inducible DNA binding proteins and methods for their use are provided in U.S. 61/736,465 and U.S. 61/721,283, which are hereby incorporated by reference in their entirety.

In particular embodiments, transient or inducible expression can be achieved by using, for example, chemical-regulated promoters, i.e. whereby the application of an exogenous chemical induces gene expression. Modulation of gene expression can also be obtained by a chemical-repressible promoter, where application of the chemical represses gene expression. Chemical-inducible promoters include, but are not limited to, the maize In2-2 promoter, activated by benzene sulfonamide herbicide safeners (De Veylder et al., (1997) Plant Cell Physiol 38:568-77), the maize GST promoter (GST-II-27, WO93/01294), activated by hydrophobic electrophilic compounds used as pre-emergent herbicides, and the tobacco PR-1a promoter (Ono et al., (2004) Biosci Biotechnol Biochem 68:803-7) activated by salicylic acid. Promoters which are regulated by antibiotics, such as tetracycline-inducible and tetracycline-repressible promoters (Gatz et al., (1991) Mol Gen Genet 227:229-37; U.S. Pat. Nos. 5,814,618 and 5,789,156) can also be used herein.

Translocation to and/or Expression in Specific Plant Organelles

The expression system may comprise elements for translocation to and/or expression in a specific plant organelle.

Chloroplast Targeting

In particular embodiments, it is envisaged that the RNA targeting CRISPR system is used to specifically modify expression and/or translation of chloroplast genes or to ensure expression in the chloroplast. For this purpose use is made of chloroplast transformation methods or compartmentalization of the RNA targeting CRISPR components to the chloroplast. For instance, the introduction of genetic modifications in the plastid genome can reduce biosafety issues such as gene flow through pollen.

Methods of chloroplast transformation are known in the art and include particle bombardment, PEG treatment, and microinjection. Additionally, methods involving the translocation of transformation cassettes from the nuclear genome to the plastid can be used as described in WO2010061186.

Alternatively, it is envisaged to target one or more of the RNA targeting CRISPR components to the plant chloroplast. This is achieved by incorporating in the expression construct a sequence encoding a chloroplast transit peptide (CTP) or plastid transit peptide, operably linked to the 5′ region of the sequence encoding the RNA targeting protein. The CTP is removed in a processing step during translocation into the chloroplast. Chloroplast targeting of expressed proteins is well known to the skilled artisan (see for instance Protein Transport into Chloroplasts, 2010, Annual Review of Plant Biology, Vol. 61: 157-180). In such embodiments it is also desired to target the one or more guide RNAs to the plant chloroplast. Methods and constructs which can be used for translocating guide RNA into the chloroplast by means of a chloroplast localization sequence are described, for instance, in US 20040142476, incorporated herein by reference. Such variations of constructs can be incorporated into the expression systems of the invention to efficiently translocate the RNA targeting-guide RNA(s).
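To make the idea of a transit peptide "operably linked to the 5′ region" of the protein coding sequence concrete, the Python sketch below joins a CTP coding sequence upstream of a protein coding sequence and checks that the fusion stays in frame. The function name `fuse_transit_peptide`, the checks it performs, and the short placeholder sequences are assumptions for illustration; real construct design involves further considerations (linkers, codon usage, the processing site) that are outside this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_transit_peptide(ctp_cds: str, protein_cds: str) -> str:
    """Place a CTP coding sequence 5' of a protein coding sequence, in frame.

    Both inputs are DNA coding sequences. A stop codon at the end of the
    CTP sequence is dropped so that translation reads through into the protein.
    """
    ctp = ctp_cds.upper()
    protein = protein_cds.upper()
    for name, seq in (("CTP", ctp), ("protein", protein)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} coding sequence is not a whole number of codons")
    if not ctp.startswith("ATG"):
        raise ValueError("CTP coding sequence should begin with a start codon")
    if ctp[-3:] in STOP_CODONS:
        ctp = ctp[:-3]  # remove the terminal stop codon of the CTP
    return ctp + protein


# Placeholder sequences only; not a real transit peptide or Cas13 gene.
fusion = fuse_transit_peptide("ATGGCTTAA", "ATGAAATGA")
print(fusion)           # ATGGCTATGAAATGA
print(len(fusion) % 3)  # 0, so the reading frame is preserved
```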

Introduction of Polynucleotides Encoding the CRISPR-RNA Targeting System in Algal Cells

Transgenic algae (or other plants such as rape) may be particularly useful in the production of vegetable oils or biofuels such as alcohols (especially methanol and ethanol) or other products. These may be engineered to express or overexpress high levels of oil or alcohols for use in the oil or biofuel industries.

U.S. Pat. No. 8,945,839 describes a method for engineering micro-algae (Chlamydomonas reinhardtii cells) using Cas9. Using similar tools, the methods of the RNA targeting CRISPR system described herein can be applied to Chlamydomonas species and other algae. In particular embodiments, RNA targeting protein and guide RNA(s) are introduced in algae using a vector that expresses RNA targeting protein under the control of a constitutive promoter such as Hsp70A-Rbc S2 or Beta2-tubulin. Guide RNA is optionally delivered using a vector containing a T7 promoter. Alternatively, RNA targeting mRNA and in vitro transcribed guide RNA can be delivered to algal cells. Electroporation protocols are available to the skilled person such as the standard recommended protocol from the GeneArt Chlamydomonas Engineering kit.

Introduction of Polynucleotides Encoding RNA Targeting Components in Yeast Cells

In particular embodiments, the invention relates to the use of the RNA targeting CRISPR system for RNA editing in yeast cells. Methods for transforming yeast cells which can be used to introduce polynucleotides encoding the RNA targeting CRISPR system components are well known to the artisan and are reviewed by Kawai et al. (Bioeng Bugs. 2010 November-December; 1(6): 395-403). Non-limiting examples include transformation of yeast cells by lithium acetate treatment (which may further include carrier DNA and PEG treatment), bombardment or by electroporation.

Transient Expression of RNA Targeting CRISPR System Components in Plants and Plant Cells

In particular embodiments, it is envisaged that the guide RNA and/or RNA targeting gene are transiently expressed in the plant cell. In these embodiments, the RNA targeting CRISPR system can ensure modification of RNA target molecules only when both the guide RNA and the RNA targeting protein are present in a cell, such that gene expression can further be controlled. As the expression of the RNA targeting enzyme is transient, plants regenerated from such plant cells typically contain no foreign DNA. In particular embodiments the RNA targeting enzyme is stably expressed by the plant cell and the guide sequence is transiently expressed.

In particularly preferred embodiments, the RNA targeting CRISPR system components can be introduced in the plant cells using a plant viral vector (Scholthof et al. 1996, Annu Rev Phytopathol. 1996; 34:299-323). In further particular embodiments, said viral vector is a vector from a DNA virus, for example, geminivirus (e.g., cabbage leaf curl virus, bean yellow dwarf virus, wheat dwarf virus, tomato leaf curl virus, maize streak virus, tobacco leaf curl virus, or tomato golden mosaic virus) or nanovirus (e.g., Faba bean necrotic yellow virus). In other particular embodiments, said viral vector is a vector from an RNA virus, for example, tobravirus (e.g., tobacco rattle virus, tobacco mosaic virus), potexvirus (e.g., potato virus X), or hordeivirus (e.g., barley stripe mosaic virus). The replicating genomes of plant viruses are non-integrative vectors, which is of interest in the context of avoiding the production of GMO plants.

In particular embodiments, the vector used for transient expression of RNA targeting CRISPR constructs is for instance a pEAQ vector, which is tailored for Agrobacterium-mediated transient expression (Sainsbury F. et al., Plant Biotechnol J. 2009 September; 7(7):682-93) in the protoplast. Precise targeting of genomic locations was demonstrated using a modified Cabbage Leaf Curl virus (CaLCuV) vector to express gRNAs in stable transgenic plants expressing a Cas13 (see Scientific Reports 5, Article number: 14926 (2015), doi:10.1038/srep14926).

In particular embodiments, double-stranded DNA fragments encoding the guide RNA or crRNA and/or the RNA targeting gene can be transiently introduced into the plant cell. In such embodiments, the introduced double-stranded DNA fragments are provided in sufficient quantity to modify RNA molecule(s) in the cell but do not persist after a contemplated period of time has passed or after one or more cell divisions. Methods for direct DNA transfer in plants are known by the skilled artisan (see for instance Davey et al. Plant Mol Biol. 1989 September; 13(3):273-85).

In other embodiments, an RNA polynucleotide encoding the RNA targeting protein is introduced into the plant cell, which is then translated and processed by the host cell, generating the protein in sufficient quantity to modify the RNA molecule(s) in the cell (in the presence of at least one guide RNA) but which does not persist after a contemplated period of time has passed or after one or more cell divisions. Methods for introducing mRNA to plant protoplasts for transient expression are known by the skilled artisan (see for instance Gallie, Plant Cell Reports (1993), 13:119-122). Combinations of the different methods described above are also envisaged.

Delivery of RNA Targeting CRISPR Components to the Plant Cell

In particular embodiments, it is of interest to deliver one or more components of the RNA targeting CRISPR system directly to the plant cell. This is of interest, inter alia, for the generation of non-transgenic plants. In particular embodiments, one or more of the RNA targeting components is prepared outside the plant or plant cell and delivered to the cell. For instance in particular embodiments, the RNA targeting protein is prepared in vitro prior to introduction to the plant cell. RNA targeting protein can be prepared by various methods known by one of skill in the art and include recombinant production. After expression, the RNA targeting protein is isolated, refolded if needed, purified and optionally treated to remove any purification tags, such as a His-tag. Once crude, partially purified, or more completely purified RNA targeting protein is obtained, the protein may be introduced to the plant cell.

In particular embodiments, the RNA targeting protein is mixed with guide RNA targeting the RNA of interest to form a pre-assembled ribonucleoprotein.

The individual components or pre-assembled ribonucleoprotein can be introduced into the plant cell via electroporation, by bombardment with RNA targeting-associated gene product coated particles, by chemical transfection or by some other means of transport across a cell membrane. For instance, transfection of a plant protoplast with a pre-assembled CRISPR ribonucleoprotein has been demonstrated to ensure targeted modification of the plant genome (as described by Woo et al. Nature Biotechnology, 2015; DOI: 10.1038/nbt.3389). These methods can be modified to achieve targeted modification of RNA molecules in the plants.

In particular embodiments, the RNA targeting CRISPR system components are introduced into the plant cells using nanoparticles. The components, either as protein or nucleic acid or in a combination thereof, can be uploaded onto or packaged in nanoparticles and applied to the plants (such as for instance described in WO 2008042156 and US 20130185823). In particular, embodiments of the invention comprise nanoparticles uploaded with or packed with DNA molecule(s) encoding the RNA targeting protein, DNA molecules encoding the guide RNA and/or isolated guide RNA as described in WO2015089419.

A further means of introducing one or more components of the RNA targeting CRISPR system to the plant cell is by using cell penetrating peptides (CPP). Accordingly, in particular embodiments, the invention comprises compositions comprising a cell penetrating peptide linked to an RNA targeting protein. In particular embodiments of the present invention, an RNA targeting protein and/or guide RNA(s) is coupled to one or more CPPs to effectively transport them inside plant protoplasts (Ramakrishna (2014), Genome Res. 2014 June; 24(6):1020-7, for Cas9 in human cells). In other embodiments, the RNA targeting gene and/or guide RNA(s) are encoded by one or more circular or non-circular DNA molecule(s) which are coupled to one or more CPPs for plant protoplast delivery. The plant protoplasts are then regenerated to plant cells and further to plants. CPPs are generally described as short peptides of fewer than 35 amino acids, either derived from proteins or from chimeric sequences, which are capable of transporting biomolecules across the cell membrane in a receptor independent manner. CPPs can be cationic peptides, peptides having hydrophobic sequences, amphipathic peptides, peptides having proline-rich and anti-microbial sequences, and chimeric or bipartite peptides (Pooga and Langel 2005). CPPs are able to penetrate biological membranes and as such trigger the movement of various biomolecules across cell membranes into the cytoplasm, improve their intracellular routing, and hence facilitate interaction of the biomolecule with the target. Examples of CPPs include, amongst others: Tat, a nuclear transcriptional activator protein required for viral replication by HIV type 1; penetratin; Kaposi fibroblast growth factor (FGF) signal peptide sequence; integrin β3 signal peptide sequence; polyarginine peptide Args sequence; Guanine rich-molecular transporters; sweet arrow peptide; etc.

Target RNA Envisaged for Plant, Algae or Fungal Applications

The target RNA, i.e. the RNA of interest, is the RNA to be targeted by the present invention leading to the recruitment to, and the binding of the RNA targeting protein at, the target site of interest on the target RNA. The target RNA may be any suitable form of RNA. This may include, in some embodiments, mRNA. In other embodiments, the target RNA may include transfer RNA (tRNA) or ribosomal RNA (rRNA). In other embodiments the target RNA may include interfering RNA (RNAi), microRNA (miRNA), microswitches, microzymes, satellite RNAs and RNA viruses. The target RNA may be located in the cytoplasm of the plant cell, or in the cell nucleus or in a plant cell organelle such as a mitochondrion, chloroplast or plastid.

In particular embodiments, the RNA targeting CRISPR system is used to cleave RNA or otherwise inhibit RNA expression.

Use of RNA Targeting CRISPR System for Modulating Plant Gene Expression Via RNA Modulation

The RNA targeting protein may also be used, together with a suitable guide RNA, to target gene expression, via control of RNA processing. The control of RNA processing may include RNA processing reactions such as RNA splicing, including alternative splicing; viral replication (in particular of plant viruses, including viroids in plants); and tRNA biosynthesis. The RNA targeting protein in combination with a suitable guide RNA may also be used to control RNA activation (RNAa). RNAa leads to the promotion of gene expression, so control of gene expression may be achieved that way through disruption or reduction of RNAa and thus less promotion of gene expression.

The RNA targeting effector protein of the invention can further be used for antiviral activity in plants, in particular against RNA viruses. The effector protein can be targeted to the viral RNA using a suitable guide RNA selective for a selected viral RNA sequence. In particular, the effector protein may be an active nuclease that cleaves RNA, such as single stranded RNA. Provided is therefore the use of an RNA targeting effector protein of the invention as an antiviral agent. Examples of viruses that can be counteracted in this way include, but are not limited to, Tobacco mosaic virus (TMV), Tomato spotted wilt virus (TSWV), Cucumber mosaic virus (CMV), Potato virus Y (PVY), Cauliflower mosaic virus (CaMV) (RT virus), Plum pox virus (PPV), Brome mosaic virus (BMV) and Potato virus X (PVX).
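
By way of illustration only, the selection of a guide sequence complementary to a chosen stretch of viral RNA can be sketched as follows. The function name, the 30-nucleotide default spacer length and the example sequence are hypothetical and are not taken from this disclosure; the sketch only shows that the spacer is the reverse complement of the targeted window and does not account for secondary structure, flanking-sequence preferences or off-target matches.

```python
# Illustrative sketch: derive a spacer that base-pairs with a window of a target RNA.
# All names and the default spacer length are assumptions made for this example.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def spacer_for_target(target_rna: str, start: int, length: int = 30) -> str:
    """Return the RNA spacer (5'->3') complementary to target_rna[start:start+length]."""
    if start < 0 or length <= 0:
        raise ValueError("start must be non-negative and length positive")
    # Accept DNA-style input by reading T as U.
    rna = target_rna.upper().replace("T", "U")
    window = rna[start:start + length]
    if len(window) < length:
        raise ValueError("window runs past the end of the target sequence")
    if set(window) - set(COMPLEMENT):
        raise ValueError("target contains characters other than A, C, G, U/T")
    # Reverse complement: the spacer pairs antiparallel with the target window.
    return "".join(COMPLEMENT[base] for base in reversed(window))


if __name__ == "__main__":
    # Toy 12-nt sequence, not a real viral genome.
    print(spacer_for_target("AUGGCUAGCUAG", start=0, length=6))
```

In practice a candidate spacer obtained this way would still need to be screened against the host transcriptome before use.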

Examples of modulating RNA expression in plants, algae or fungi, as an alternative to targeted gene modification, are described herein further.

Of particular interest is the regulated control of gene expression through regulated cleavage of mRNA. This can be achieved by placing elements of the RNA targeting system under the control of regulated promoters as described herein.

Use of the RNA Targeting CRISPR System to Restore the Functionality of tRNA Molecules.

Pring et al. describe RNA editing in plant mitochondria and chloroplasts that alters mRNA sequences to code for different proteins than the DNA (Plant Mol. Biol. (1993) 21 (6): 1163-1170. doi:10.1007/BF00023611). In particular embodiments of the invention, the elements of the RNA targeting CRISPR system specifically targeting mitochondrial and chloroplast mRNA can be introduced in a plant or plant cell to express different proteins in such plant cell organelles, mimicking the processes occurring in vivo.

Use of the RNA Targeting CRISPR System as an Alternative to RNA Interference to Inhibit RNA Expression.

The RNA targeting CRISPR system has uses similar to RNA inhibition or RNA interference, and thus can also be substituted for such methods. In particular embodiments, the methods of the present invention include the use of the RNA targeting CRISPR system as a substitute for e.g. an interfering ribonucleic acid (such as an siRNA or shRNA or a dsRNA). Examples of inhibition of RNA expression in plants, algae or fungi as an alternative to targeted gene modification are described herein further.

Use of the RNA Targeting CRISPR System to Control RNA Interference.

Control over interfering RNA or miRNA may help reduce off-target effects (OTE) seen with those approaches by reducing the longevity of the interfering RNA or miRNA in vivo or in vitro. In particular embodiments, the target RNA may include interfering RNA, i.e. RNA involved in an RNA interference pathway, such as shRNA, siRNA and so forth. In other embodiments, the target RNA may include microRNA (miRNA) or double stranded RNA (dsRNA).

In other particular embodiments, if the RNA targeting protein and suitable guide RNA(s) are selectively expressed (for example spatially or temporally under the control of a regulated promoter, for example a tissue- or cell cycle-specific promoter and/or enhancer) this can be used to ‘protect’ the cells or systems (in vivo or in vitro) from RNAi in those cells. This may be useful in neighboring tissues or cells where RNAi is not required or for the purposes of comparison of the cells or tissues where the effector protein and suitable guide are and are not expressed (i.e. where the RNAi is not controlled and where it is, respectively). The RNA targeting protein may be used to control or bind to molecules comprising or consisting of RNA, such as ribozymes, ribosomes or riboswitches. In embodiments of the invention, the guide RNA can recruit the RNA targeting protein to these molecules so that the RNA targeting protein is able to bind to them.

The RNA targeting CRISPR system of the invention can be applied in areas of in-planta RNAi technologies, without undue experimentation, from this disclosure, including insect pest management, plant disease management and management of herbicide resistance, as well as in plant assay and for other applications (see, for instance Kim et al., in Pesticide Biochemistry and Physiology, 01/2015; 120. DOI: 10.1016/j.pestbp.2015.01.002; Sharma et al. in Academic Journals (2015), Vol. 12(18) pp 2303-2312; Green J. M, in Pest Management Science, Vol 70(9), pp 1351-1357), because the present application provides the foundation for informed engineering of the system.

Use of RNA Targeting CRISPR System to Modify Riboswitches and Control Metabolic Regulation in Plants, Algae and Fungi

Riboswitches (also known as aptozymes) are regulatory segments of messenger RNA that bind small molecules and in turn regulate gene expression. This mechanism allows the cell to sense the intracellular concentration of these small molecules. A particular riboswitch typically regulates its adjacent gene by altering the transcription, the translation or the splicing of this gene. Thus, in particular embodiments of the present invention, control of riboswitch activity is envisaged through the use of the RNA targeting protein in combination with a suitable guide RNA to target the riboswitch. This may be through cleavage of, or binding to, the riboswitch. In particular embodiments, reduction of riboswitch activity is envisaged. Recently, a riboswitch that binds thiamin pyrophosphate (TPP) was characterized and found to regulate thiamin biosynthesis in plants and algae. Furthermore, it appears that this element is an essential regulator of primary metabolism in plants (Bocobza and Aharoni, Plant J. 2014 August; 79(4):693-703. doi: 10.1111/tpj.12540. Epub 2014 Jun. 17). TPP riboswitches are also found in certain fungi, such as in Neurospora crassa, where the riboswitch controls alternative splicing to conditionally produce an Upstream Open Reading Frame (uORF), thereby affecting the expression of downstream genes (Cheah M T et al., (2007) Nature 447 (7143): 497-500. doi:10.1038/nature05769). The RNA targeting CRISPR system described herein may be used to manipulate the endogenous riboswitch activity in plants, algae or fungi and as such alter the expression of downstream genes controlled by it. In particular embodiments, the RNA targeting CRISPR system may be used in assaying riboswitch function in vivo or in vitro and in studying its relevance for the metabolic network. In particular embodiments the RNA targeting CRISPR system may potentially be used for engineering of riboswitches as metabolite sensors in plants and platforms for gene control.

Use of RNA Targeting CRISPR System in RNAi Screens for Plants, Algae or Fungi

By identifying gene products whose knockdown is associated with phenotypic changes, biological pathways can be interrogated and the constituent parts identified, via RNAi screens. In particular embodiments of the invention, control may also be exerted over or during these screens by use of the Guide 29 or Guide 30 protein and suitable guide RNA described herein to remove or reduce the activity of the RNAi in the screen and thus reinstate the activity of the (previously interfered with) gene product (by removing or reducing the interference/repression).

Use of RNA Targeting Proteins for Visualization of RNA Molecules In Vivo and In Vitro

In particular embodiments, the invention provides a nucleic acid binding system. In situ hybridization of RNA with complementary probes is a powerful technique. Typically fluorescent DNA oligonucleotides are used to detect nucleic acids by hybridization. Increased efficiency has been attained by certain modifications, such as locked nucleic acids (LNAs), but there remains a need for efficient and versatile alternatives. As such, labelled elements of the RNA targeting system can be used as an alternative, efficient and adaptable system for in situ hybridization.

Further Applications of the RNA Targeting CRISPR System in Plants and Yeasts

Use of RNA Targeting CRISPR System in Biofuel Production

The term “biofuel” as used herein is an alternative fuel made from plant and plant-derived resources. Renewable biofuels can be extracted from organic matter whose energy has been obtained through a process of carbon fixation or are made through the use or conversion of biomass. This biomass can be used directly for biofuels or can be converted to convenient energy containing substances by thermal conversion, chemical conversion, and biochemical conversion. This biomass conversion can result in fuel in solid, liquid, or gas form. There are two types of biofuels: bioethanol and biodiesel. Bioethanol is mainly produced by the sugar fermentation process of cellulose (starch), which is mostly derived from maize and sugar cane. Biodiesel on the other hand is mainly produced from oil crops such as rapeseed, palm, and soybean. Biofuels are used mainly for transportation.

Enhancing Plant Properties for Biofuel Production

In particular embodiments, the methods using the RNA targeting CRISPR system as described herein are used to alter the properties of the cell wall in order to facilitate access by key hydrolysing agents for a more efficient release of sugars for fermentation. In particular embodiments, the biosynthesis of cellulose and/or lignin are modified. Cellulose is the major component of the cell wall. The biosynthesis of cellulose and lignin are co-regulated. By reducing the proportion of lignin in a plant the proportion of cellulose can be increased. In particular embodiments, the methods described herein are used to downregulate lignin biosynthesis in the plant so as to increase fermentable carbohydrates. More particularly, the methods described herein are used to downregulate at least a first lignin biosynthesis gene selected from the group consisting of 4-coumarate 3-hydroxylase (C3H), phenylalanine ammonia-lyase (PAL), cinnamate 4-hydroxylase (C4H), hydroxycinnamoyl transferase (HCT), caffeic acid O-methyltransferase (COMT), caffeoyl CoA 3-O-methyltransferase (CCoAOMT), ferulate 5-hydroxylase (F5H), cinnamyl alcohol dehydrogenase (CAD), cinnamoyl CoA-reductase (CCR), 4-coumarate-CoA ligase (4CL), monolignol-lignin-specific glycosyltransferase, and aldehyde dehydrogenase (ALDH) as disclosed in WO 2008064289 A2.

In particular embodiments, the methods described herein are used to produce plant mass that produces lower levels of acetic acid during fermentation (see also WO 2010096488).

Modifying Yeast for Biofuel Production

In particular embodiments, the RNA targeting enzyme provided herein is used for bioethanol production by recombinant micro-organisms. For instance, RNA targeting enzymes can be used to engineer micro-organisms, such as yeast, to generate biofuel or biopolymers from fermentable sugars and optionally to be able to degrade plant-derived lignocellulose derived from agricultural waste as a source of fermentable sugars. More particularly, the invention provides methods whereby the RNA targeting CRISPR complex is used to modify the expression of endogenous genes required for biofuel production and/or to modify endogenous genes which may interfere with the biofuel synthesis. More particularly the methods involve stimulating the expression in a micro-organism such as a yeast of one or more nucleotide sequences encoding enzymes involved in the conversion of pyruvate to ethanol or another product of interest. In particular embodiments the methods ensure the stimulation of expression of one or more enzymes which allow the micro-organism to degrade cellulose, such as a cellulase. In yet further embodiments, the RNA targeting CRISPR complex is used to suppress endogenous metabolic pathways which compete with the biofuel production pathway.

Modifying Algae and Plants for Production of Vegetable Oils or Biofuels

Transgenic algae or other plants such as rape may be particularly useful in the production of vegetable oils or biofuels such as alcohols (especially methanol and ethanol), for instance. These may be engineered to express or overexpress high levels of oil or alcohols for use in the oil or biofuel industries.

U.S. Pat. No. 8,945,839 describes a method for engineering micro-algae (Chlamydomonas reinhardtii cells) using Cas9. Using similar tools, the methods of the RNA targeting CRISPR system described herein can be applied on Chlamydomonas species and other algae. In particular embodiments, the RNA targeting effector protein and guide RNA are introduced into algae using a vector that expresses the RNA targeting effector protein under the control of a constitutive promoter such as Hsp70A-Rbc S2 or Beta2-tubulin. Guide RNA will be delivered using a vector containing a T7 promoter. Alternatively, in vitro transcribed guide RNA can be delivered to algae cells. The electroporation protocol follows the standard recommended protocol from the GeneArt Chlamydomonas Engineering kit.

Particular Applications of the RNA Targeting Enzymes in Plants

In particular embodiments, the present invention can be used as a therapy for virus removal in plant systems as it is able to cleave viral RNA. Previous studies in human systems have demonstrated the success of utilizing CRISPR in targeting the single strand RNA virus, hepatitis C (A. Price, et al., Proc. Natl. Acad. Sci, 2015). These methods may also be adapted for using the RNA targeting CRISPR system in plants.

Improved Plants

The present invention also provides plants and yeast cells obtainable and obtained by the methods provided herein. The improved plants obtained by the methods described herein may be useful in food or feed production through the modified expression of genes which, for instance, ensure tolerance to plant pests, herbicides, drought, low or high temperatures, excessive water, etc.

The improved plants obtained by the methods described herein, especially crops and algae, may be useful in food or feed production through expression of, for instance, higher protein, carbohydrate, nutrient or vitamin levels than would normally be seen in the wildtype. In this regard, improved plants, especially pulses and tubers, are preferred.

Improved algae or other plants such as rape may be particularly useful in the production of vegetable oils or biofuels such as alcohols (especially methanol and ethanol), for instance. These may be engineered to express or overexpress high levels of oil or alcohols for use in the oil or biofuel industries.

The invention also provides for improved parts of a plant. Plant parts include, but are not limited to, leaves, stems, roots, tubers, seeds, endosperm, ovule, and pollen. Plant parts as envisaged herein may be viable, nonviable, regeneratable, and/or non-regeneratable.

It is also encompassed herein to provide plant cells and plants generated according to the methods of the invention. Gametes, seeds, embryos, either zygotic or somatic, progeny or hybrids of plants comprising the genetic modification, which are produced by traditional breeding methods, are also included within the scope of the present invention. Such plants may contain a heterologous or foreign DNA sequence inserted at or instead of a target sequence. Alternatively, such plants may contain only an alteration (mutation, deletion, insertion, substitution) in one or more nucleotides. As such, such plants will only be different from their progenitor plants by the presence of the particular modification.

In an embodiment of the invention, a CRISPR-Cas system is used to engineer pathogen resistant plants, for example by creating resistance against diseases caused by bacteria, fungi or viruses. In certain embodiments, pathogen resistance can be accomplished by engineering crops to produce a CRISPR-Cas system that will be ingested by an insect pest, leading to mortality. In an embodiment of the invention, a CRISPR-Cas system is used to engineer abiotic stress tolerance. In another embodiment, a CRISPR-Cas system is used to engineer drought stress tolerance or salt stress tolerance, or cold or heat stress tolerance. Younis et al. 2014, Int. J. Biol. Sci. 10; 1150 reviewed potential targets of plant breeding methods, all of which are amenable to correction or improvement through use of a CRISPR-Cas system described herein. Some non-limiting target crops include Oryza sativa L, Prunus domestica L., Gossypium hirsutum, Nicotiana rustica, Zea mays, Medicago sativa, Nicotiana benthamiana and Arabidopsis thaliana.

In an embodiment of the invention, a CRISPR-Cas system is used for management of crop pests. For example, a CRISPR-Cas system operable in a crop pest can be expressed from a plant host or transferred directly to the target, for example using a viral vector.

In an embodiment, the invention provides a method of efficiently producing homozygous organisms from a heterozygous non-human starting organism. In an embodiment, the invention is used in plant breeding. In another embodiment, the invention is used in animal breeding. In such embodiments, a homozygous organism such as a plant or animal is made by preventing or suppressing recombination by interfering with at least one target gene involved in double strand breaks, chromosome pairing and/or strand exchange.

CRISPR-Cas Effector Protein Complexes can be Used in Plants

The invention in some embodiments comprehends a method of modifying a cell or organism. The cell may be a prokaryotic cell or a eukaryotic cell. The cell may be a mammalian cell. The mammalian cell may be a non-human primate, bovine, porcine, rodent or mouse cell. The cell may be a non-mammalian eukaryotic cell such as poultry, fish or shrimp. The cell may also be a plant cell. The plant cell may be of a crop plant such as cassava, corn, sorghum, wheat, or rice. The plant cell may also be of an algae, tree or vegetable. The modification introduced to the cell by the present invention may be such that the cell and progeny of the cell are altered for improved production of biologic products such as an antibody, starch, alcohol or other desired cellular output. The modification introduced to the cell by the present invention may be such that the cell and progeny of the cell include an alteration that changes the biologic product produced. The system may comprise one or more different vectors. In an aspect of the invention, the effector protein is codon optimized for expression in the desired cell type, preferentially a eukaryotic cell, preferably a mammalian cell or a human cell. CRISPR-Cas system(s) (e.g., single or multiplexed) can be used in conjunction with recent advances in crop genomics. Such CRISPR system(s) can be used to perform efficient and cost effective plant gene or genome or transcriptome interrogation or editing or manipulation, for instance, for rapid investigation and/or selection and/or interrogations and/or comparison and/or manipulations and/or transformation of plant genes or genomes; e.g., to create, identify, develop, optimize, or confer trait(s) or characteristic(s) to plant(s) or to transform a plant genome. There can accordingly be improved production of plants, new plants with new combinations of traits or characteristics or new plants with enhanced traits.
Such CRISPR system(s) can be used with regard to plants in Site-Directed Integration (SDI) or Gene Editing (GE) or any Near Reverse Breeding (NRB) or Reverse Breeding (RB) techniques. Accordingly, reference herein to animal cells may also apply, mutatis mutandis, to plant cells unless otherwise apparent; and, the enzymes herein having reduced off-target effects and systems employing such enzymes can be used in plant applications, including those mentioned herein. Engineered plants modified by the effector protein and suitable guide (crRNA), and progeny thereof, are provided. These may include disease or drought resistant crops, such as wheat, barley, rice, soybean or corn; plants modified to remove or reduce the ability to self-pollinate (but which can, optionally, hybridise instead); and allergenic foods such as peanuts and nuts where the immunogenic proteins have been disabled, destroyed or disrupted by targeting via an effector protein and suitable guide. Any aspect of using classical CRISPR-Cas systems may be adapted to use in CRISPR systems that are Cas protein agnostic, e.g. Cas13 effector protein systems.

Models of Conditions

A method of the invention may be used to create a plant, an animal or cell that may be used to model and/or study genetic or epigenetic conditions of interest, such as through a model of mutations of interest or a disease model. As used herein, “disease” refers to a disease, disorder, or indication in a subject. For example, a method of the invention may be used to create an animal or cell that comprises a modification in one or more nucleic acid sequences associated with a disease, or a plant, animal or cell in which expression of one or more nucleic acid sequences associated with a disease are altered. Such a nucleic acid sequence may encode or be translated into a disease associated protein sequence or may be a disease associated control sequence. Accordingly, it is understood that in embodiments of the invention, a plant, subject, patient, organism or cell can be a non-human subject, patient, organism or cell. Thus, the invention provides a plant, animal or cell, produced by the present methods, or a progeny thereof. The progeny may be a clone of the produced plant or animal, or may result from sexual reproduction by crossing with other individuals of the same species to introgress further desirable traits into their offspring. The cell may be in vivo or ex vivo in the cases of multicellular organisms, particularly animals or plants. In the instance where the cell is in culture, a cell line may be established if appropriate culturing conditions are met and preferably if the cell is suitably adapted for this purpose (for instance a stem cell). Bacterial cell lines produced by the invention are also envisaged. Hence, cell lines are also envisaged. In some methods, the disease model can be used to study the effects of mutations, or more generally altered, such as reduced, expression of genes or gene products on the animal or cell and development and/or progression of the disease using measures commonly used in the study of the disease.
Alternatively, such a disease model is useful for studying the effect of a pharmaceutically active compound on the disease. In some methods, the disease model can be used to assess the efficacy of a potential gene therapy strategy. That is, a disease-associated RNA can be modified such that the disease development and/or progression is displayed or inhibited or reduced and then effects of a compound on the progression or inhibition or reduction are tested.

Useful in the practice of the instant invention utilizing CRISPR-Cas effector proteins and complexes thereof and nucleic acid molecules encoding same and methods using same, reference is made to: Genome-Scale CRISPR-Cas9 Knockout Screening in Human Cells. Shalem, O., Sanjana, N E., Hartenian, E., Shi, X., Scott, D A., Mikkelsen, T., Heckl, D., Ebert, B L., Root, D E., Doench, J G., Zhang, F. Science December 12. (2013). [Epub ahead of print]; Published in final edited form as: Science. 2014 Jan. 3; 343(6166): 84-87. Shalem et al. involves a new way to interrogate gene function on a genome-wide scale. Their studies showed that delivery of a genome-scale CRISPR-Cas9 knockout (GeCKO) library that targeted 18,080 genes with 64,751 unique guide sequences enabled both negative and positive selection screening in human cells. First, the authors showed use of the GeCKO library to identify genes essential for cell viability in cancer and pluripotent stem cells. Next, in a melanoma model, the authors screened for genes whose loss is involved in resistance to vemurafenib, a therapeutic that inhibits mutant protein kinase BRAF. Their studies showed that the highest-ranking candidates included previously validated genes NF1 and MED12 as well as novel hits NF2, CUL3, TADA2B, and TADA1. The authors observed a high level of consistency between independent guide RNAs targeting the same gene and a high rate of hit confirmation, and thus demonstrated the promise of genome-scale screening with Cas9. Reference is also made to US patent publication number US20140357530; and PCT Patent Publication WO2014093701, hereby incorporated herein by reference.

The term “associated with” is used here in relation to the association of the functional domain to the CRISPR-Cas effector protein or the adaptor protein. It is used in respect of how one molecule ‘associates’ with respect to another, for example between an adaptor protein and a functional domain, or between the CRISPR-Cas effector protein and a functional domain. In the case of such protein-protein interactions, this association may be viewed in terms of recognition in the way an antibody recognizes an epitope. Alternatively, one protein may be associated with another protein via a fusion of the two, for instance one subunit being fused to another subunit. Fusion typically occurs by addition of the amino acid sequence of one to that of the other, for instance via splicing together of the nucleotide sequences that encode each protein or subunit. Alternatively, this may essentially be viewed as binding between two molecules or direct linkage, such as a fusion protein. In any event, the fusion protein may include a linker between the two subunits of interest (i.e. between the enzyme and the functional domain or between the adaptor protein and the functional domain). Thus, in some embodiments, the CRISPR-Cas effector protein or adaptor protein is associated with a functional domain by binding thereto. In other embodiments, the CRISPR-Cas effector protein or adaptor protein is associated with a functional domain because the two are fused together, optionally via an intermediate linker.

Therapeutic Applications

The system of the invention can be applied in areas of former RNA cutting technologies, without undue experimentation, from this disclosure, including therapeutic, assay and other applications, because the present application provides the foundation for informed engineering of the system. The present invention provides for therapeutic treatment of a disease caused by overexpression of RNA, toxic RNA and/or mutated RNA (such as, for example, splicing defects or truncations). Expression of the toxic RNA may be associated with formation of nuclear inclusions and late-onset degenerative changes in brain, heart or skeletal muscle. In the best studied example, myotonic dystrophy, it appears that the main pathogenic effect of the toxic RNA is to sequester binding proteins and compromise the regulation of alternative splicing (Hum. Mol. Genet. (2006) 15 (suppl 2): R162-R169). Myotonic dystrophy [dystrophia myotonica (DM)] is of particular interest to geneticists because it produces an extremely wide range of clinical features. A partial listing would include muscle wasting, cataracts, insulin resistance, testicular atrophy, slowing of cardiac conduction, cutaneous tumors and effects on cognition. The classical form of DM, which is now called DM type 1 (DM1), is caused by an expansion of CTG repeats in the 3′-untranslated region (UTR) of DMPK, a gene encoding a cytosolic protein kinase.

The innate immune system detects viral infection primarily by recognizing viral nucleic acids inside an infected cell, referred to as DNA or RNA sensing. In vitro RNA sensing assays can be used to detect specific RNA substrates. The RNA targeting effector protein can for instance be used for RNA-based sensing in living cells. Examples of applications are diagnostics by sensing of, for example, disease-specific RNAs. The RNA targeting effector protein of the invention can further be used for antiviral activity, in particular against RNA viruses. The effector protein can be targeted to the viral RNA using a suitable guide RNA selective for a selected viral RNA sequence. In particular, the effector protein may be an active nuclease that cleaves RNA, such as single stranded RNA. Therapeutic dosages of the enzyme system of the present invention to target the above-referenced RNAs are contemplated to be about 0.1 to about 2 mg/kg; the dosages may be administered sequentially with a monitored response, and repeated dosages if necessary, up to about 7 to 10 doses per patient. Advantageously, samples are collected from each patient during the treatment regimen to ascertain the effectiveness of treatment. For example, RNA samples may be isolated and quantified to determine if expression is reduced or ameliorated. Such a diagnostic is within the purview of one of skill in the art.
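
As a purely arithmetic illustration of the contemplated range of about 0.1 to about 2 mg/kg and the upper bound of about 10 doses, the following sketch converts the per-kilogram range into an absolute amount for a given body weight. The function names and the 70 kg example weight are hypothetical and do not appear in this disclosure; the sketch is not a dosing protocol.

```python
# Illustrative arithmetic only; names and the example body weight are assumptions.
def dose_range_mg(body_weight_kg: float,
                  low_mg_per_kg: float = 0.1,
                  high_mg_per_kg: float = 2.0) -> tuple:
    """Return (low, high) absolute dose in mg for the contemplated mg/kg range."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


def doses_within_limit(n_doses: int, max_doses: int = 10) -> bool:
    """Check a planned number of doses against the upper bound of about 10."""
    return 0 < n_doses <= max_doses


if __name__ == "__main__":
    low, high = dose_range_mg(70.0)  # hypothetical 70 kg subject
    print(f"per-dose range: {low:.1f} mg to {high:.1f} mg")
    print("7 doses within limit:", doses_within_limit(7))
```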

In some examples, the disease is caused by a G→A or C→T point mutation or a pathogenic SNP. In some examples, the disease is caused by a T→C or A→G point mutation or a pathogenic SNP. For example, the disease may be cancer, haemophilia, beta-thalassemia, Marfan syndrome or Wiskott-Aldrich syndrome.
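By way of illustration only, the four substitutions named above (G→A, C→T, T→C and A→G) are all transitions, that is, purine-to-purine or pyrimidine-to-pyrimidine changes. The following Python sketch is not part of the disclosed system; the names `classify_substitution` and `LISTED_MUTATIONS` are hypothetical and are introduced here only to make that classification explicit.

```python
# Illustrative sketch only: classify a single-base DNA substitution as a
# transition or a transversion. All names here are hypothetical.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a ref->alt base change."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError(f"not a DNA base: {ref!r} or {alt!r}")
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # A transition keeps the base within its chemical class.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# The four point mutations named in the paragraph above.
LISTED_MUTATIONS = [("G", "A"), ("C", "T"), ("T", "C"), ("A", "G")]

if __name__ == "__main__":
    for ref, alt in LISTED_MUTATIONS:
        print(f"{ref}->{alt}: {classify_substitution(ref, alt)}")
```

Run as a script, the loop reports each of the four listed substitutions as a transition; a change such as A→C would instead be reported as a transversion.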

Exemplary Therapies

The present invention also contemplates use of the CRISPR-Cas system and the base editor described herein, for treatment in a variety of diseases and disorders. In some embodiments, the invention described herein relates to a method for therapy in which cells are edited ex vivo by CRISPR or the base editor to modulate at least one gene, with subsequent administration of the edited cells to a patient in need thereof. In some embodiments, the editing involves knocking in, knocking out or knocking down expression of at least one target gene in a cell. In particular embodiments, the editing inserts an exogenous gene, minigene or sequence, which may comprise one or more exons and introns or natural or synthetic introns, into the locus of a target gene, a hot-spot locus, or a safe harbor locus (a genomic location where new genes or genetic elements can be introduced without disrupting the expression or regulation of adjacent genes), or corrects, by insertions or deletions, one or more mutations in DNA sequences that encode regulatory elements of a target gene. In some embodiments, the editing comprises introducing one or more point mutations in a nucleic acid (e.g., a genomic DNA) in a target cell.

In embodiments, the treatment is for a disease/disorder of an organ, including liver disease, eye disease, muscle disease, heart disease, blood disease, brain disease, kidney disease, or may comprise treatment for an autoimmune disease, central nervous system disease, cancer and other proliferative diseases, neurodegenerative disorders, inflammatory disease, metabolic disorder, musculoskeletal disorder and the like.

Particular diseases/disorders include achondroplasia, achromatopsia, acid maltase deficiency, adrenoleukodystrophy, Aicardi syndrome, alpha-1 antitrypsin deficiency, alpha-thalassemia, androgen insensitivity syndrome, Apert syndrome, arrhythmogenic right ventricular dysplasia, ataxia telangiectasia, Barth syndrome, beta-thalassemia, blue rubber bleb nevus syndrome, Canavan disease, chronic granulomatous diseases (CGD), cri du chat syndrome, cystic fibrosis, Dercum's disease, ectodermal dysplasia, Fanconi anemia, fibrodysplasia ossificans progressiva, fragile X syndrome, galactosemia, Gaucher's disease, generalized gangliosidoses (e.g., GM1), hemochromatosis, the hemoglobin C mutation in the 6th codon of beta-globin (HbC), hemophilia, Huntington's disease, Hurler Syndrome, hypophosphatasia, Klinefelter syndrome, Krabbe's Disease, Langer-Giedion Syndrome, leukodystrophy, long QT syndrome, Marfan syndrome, Moebius syndrome, mucopolysaccharidosis (MPS), nail patella syndrome, nephrogenic diabetes insipidus, neurofibromatosis, Niemann-Pick disease, osteogenesis imperfecta, Porphyria, Prader-Willi syndrome, progeria, Proteus syndrome, retinoblastoma, Rett syndrome, Rubinstein-Taybi syndrome, Sanfilippo syndrome, severe combined immunodeficiency (SCID), Shwachman syndrome, sickle cell disease (sickle cell anemia), Smith-Magenis syndrome, Stickler syndrome, Tay-Sachs disease, Thrombocytopenia Absent Radius (TAR) syndrome, Treacher Collins syndrome, trisomy, tuberous sclerosis, Turner's syndrome, urea cycle disorder, von Hippel-Lindau disease, Waardenburg syndrome, Williams syndrome, Wilson's disease, and Wiskott-Aldrich syndrome.

In embodiments, the disease is associated with expression of a tumor antigen, e.g., a proliferative disease, a precancerous condition, a cancer, or a non-cancer related indication associated with expression of the tumor antigen, which may in some embodiments comprise a target selected from B2M, CD247, CD3D, CD3E, CD3G, TRAC, TRBC1, TRBC2, HLA-A, HLA-B, HLA-C, DCK, CD52, FKBP1A, CIITA, NLRC5, RFXANK, RFX5, RFXAP, or NR3C1, HAVCR2, LAG3, PDCD1, PD-L2, CTLA4, CEACAM (CEACAM-1, CEACAM-3 and/or CEACAM-5), VISTA, BTLA, TIGIT, LAIR1, CD160, 2B4, CD80, CD86, B7-H3 (CD113), B7-H4 (VTCN1), HVEM (TNFRSF14 or CD107), KIR, A2aR, MHC class I, MHC class II, GAL9, adenosine, and TGF beta, or PTPN11, DCK, CD52, NR3C1, LILRB1, CD19; CD123; CD22; CD30; CD171; CS-1 (also referred to as CD2 subset 1, CRACC, SLAMF7, CD319, and 19A24); C-type lectin-like molecule-1 (CLL-1 or CLECL1); CD33; epidermal growth factor receptor variant III (EGFRvIII); ganglioside G2 (GD2); ganglioside GD3 (aNeu5Ac(2-8)aNeu5Ac(2-3)bDGalp(1-4)bDGlcp(1-1)Cer); TNF receptor family member B cell maturation (BCMA); Tn antigen ((Tn Ag) or (GalNAca-Ser/Thr)); prostate-specific membrane antigen (PSMA); Receptor tyrosine kinase-like orphan receptor 1 (ROR1); Fms-Like Tyrosine Kinase 3 (FLT3); Tumor-associated glycoprotein 72 (TAG72); CD38; CD44v6; Carcinoembryonic antigen (CEA); Epithelial cell adhesion molecule (EPCAM); B7H3 (CD276); KIT (CD117); Interleukin-13 receptor subunit alpha-2 (IL-13Ra2 or CD213A2); Mesothelin; Interleukin 11 receptor alpha (IL-11Ra); prostate stem cell antigen (PSCA); Protease Serine 21 (Testisin or PRSS21); vascular endothelial growth factor receptor 2 (VEGFR2); Lewis(Y) antigen; CD24; Platelet-derived growth factor receptor beta (PDGFR-beta); Stage-specific embryonic antigen-4 (SSEA-4); CD20; Folate receptor alpha; Receptor tyrosine-protein kinase ERBB2 (Her2/neu); Mucin 1, cell surface associated (MUC1); epidermal growth factor receptor (EGFR); neural cell adhesion molecule (NCAM); Prostase; 
prostatic acid phosphatase (PAP); elongation factor 2 mutated (ELF2M); Ephrin B2; fibroblast activation protein alpha (FAP); insulin-like growth factor 1 receptor (IGF-I receptor); carbonic anhydrase IX (CAIX); Proteasome (Prosome, Macropain) Subunit, Beta Type, 9 (LMP2); glycoprotein 100 (gp100); oncogene fusion protein consisting of breakpoint cluster region (BCR) and Abelson murine leukemia viral oncogene homolog 1 (Abl) (bcr-abl); tyrosinase; ephrin type-A receptor 2 (EphA2); Fucosyl GM1; sialyl Lewis adhesion molecule (sLe); ganglioside GM3 (aNeu5Ac(2-3)bDGalp(1-4)bDGlcp(1-1)Cer); transglutaminase 5 (TGS5); high molecular weight-melanoma-associated antigen (HMWMAA); o-acetyl-GD2 ganglioside (OAcGD2); Folate receptor beta; tumor endothelial marker 1 (TEM1/CD248); tumor endothelial marker 7-related (TEM7R); claudin 6 (CLDN6); thyroid stimulating hormone receptor (TSHR); G protein-coupled receptor class C group 5, member D (GPRC5D); chromosome X open reading frame 61 (CXORF61); CD97; CD179a; anaplastic lymphoma kinase (ALK); Polysialic acid; placenta-specific 1 (PLAC1); hexasaccharide portion of globoH glycoceramide (GloboH); mammary gland differentiation antigen (NY-BR-1); uroplakin 2 (UPK2); Hepatitis A virus cellular receptor 1 (HAVCR1); adrenoceptor beta 3 (ADRB3); pannexin 3 (PANX3); G protein-coupled receptor 20 (GPR20); lymphocyte antigen 6 complex, locus K 9 (LY6K); Olfactory receptor 51E2 (OR51E2); TCR Gamma Alternate Reading Frame Protein (TARP); Wilms tumor protein (WT1); Cancer/testis antigen 1 (NY-ESO-1); Cancer/testis antigen 2 (LAGE-1a); Melanoma-associated antigen 1 (MAGE-A1); ETS translocation-variant gene 6, located on chromosome 12p (ETV6-AML); sperm protein 17 (SPA17); X Antigen Family, Member 1A (XAGE1); angiopoietin-binding cell surface receptor 2 (Tie 2); melanoma cancer testis antigen-1 (MAD-CT-1); melanoma cancer testis antigen-2 (MAD-CT-2); Fos-related antigen 1; tumor protein p53 (p53); p53 mutant; prostein; survivin; telomerase; prostate carcinoma tumor antigen-1 
(PCTA-1 or Galectin 8), melanoma antigen recognized by T cells 1 (MelanA or MART1); Rat sarcoma (Ras) mutant; human Telomerase reverse transcriptase (hTERT); sarcoma translocation breakpoints; melanoma inhibitor of apoptosis (ML-IAP); ERG (transmembrane protease, serine 2 (TMPRSS2) ETS fusion gene); N-Acetyl glucosaminyl-transferase V (NA17); paired box protein Pax-3 (PAX3); Androgen receptor; Cyclin B1; v-myc avian myelocytomatosis viral oncogene neuroblastoma derived homolog (MYCN); Ras Homolog Family Member C (RhoC); Tyrosinase-related protein 2 (TRP-2); Cytochrome P450 1B1 (CYP1B1); CCCTC-Binding Factor (Zinc Finger Protein)-Like (BORIS or Brother of the Regulator of Imprinted Sites), Squamous Cell Carcinoma Antigen Recognized By T Cells 3 (SART3); Paired box protein Pax-5 (PAX5); proacrosin binding protein sp32 (OY-TES1); lymphocyte-specific protein tyrosine kinase (LCK); A kinase anchor protein 4 (AKAP-4); synovial sarcoma, X breakpoint 2 (SSX2); Receptor for Advanced Glycation Endproducts (RAGE-1); renal ubiquitous 1 (RU1); renal ubiquitous 2 (RU2); legumain; human papilloma virus E6 (HPV E6); human papilloma virus E7 (HPV E7); intestinal carboxyl esterase; heat shock protein 70-2 mutated (mut hsp70-2); CD79a; CD79b; CD72; Leukocyte-associated immunoglobulin-like receptor 1 (LAIR1); Fc fragment of IgA receptor (FCAR or CD89); Leukocyte immunoglobulin-like receptor subfamily A member 2 (LILRA2); CD300 molecule-like family member f (CD300LF); C-type lectin domain family 12 member A (CLEC12A); bone marrow stromal cell antigen 2 (BST2); EGF-like module-containing mucin-like hormone receptor-like 2 (EMR2); lymphocyte antigen 75 (LY75); Glypican-3 (GPC3); Fc receptor-like 5 (FCRL5); and immunoglobulin lambda-like polypeptide 1 (IGLL1), CD19, BCMA, CD70, G6PC, Dystrophin, including modification of exon 51 by deletion or excision, DMPK, CFTR (cystic fibrosis transmembrane conductance regulator). In embodiments, the targets comprise CD70, or a Knock-in of CD33 and Knock-out of B2M. 
In embodiments, the targets comprise a knockout of TRAC and B2M, or TRAC, B2M and PD1, with or without additional target genes. In certain embodiments, the disease is cystic fibrosis with targeting of the SCNN1A gene, e.g., the non-coding or coding regions, e.g., a promoter region, or a transcribed sequence, e.g., intronic or exonic sequence, or targeted knock-in at CFTR sequence within intron 2, into which, e.g., can be introduced CFTR sequence that codes for CFTR exons 3-27; and sequence within CFTR intron 10, into which sequence that codes for CFTR exons 11-27 can be introduced.

In embodiments, the disease is Metachromatic Leukodystrophy and the target is Arylsulfatase A; the disease is Wiskott-Aldrich Syndrome and the target is Wiskott-Aldrich Syndrome protein; the disease is Adrenoleukodystrophy and the target is ATP-binding cassette D1; the disease is Human Immunodeficiency Virus and the target is C-C chemokine receptor type 5 or CXCR4 gene; the disease is Beta-thalassemia and the target is Hemoglobin beta subunit; the disease is X-linked Severe Combined Immunodeficiency and the target is interleukin-2 receptor subunit gamma; the disease is Multisystemic Lysosomal Storage Disorder cystinosis and the target is cystinosin; the disease is Diamond-Blackfan anemia and the target is Ribosomal protein S19; the disease is Fanconi Anemia and the target is Fanconi anemia complementation groups (e.g., FANCA, FANCB, FANCC, FANCD1, FANCD2, FANCE, FANCF, RAD51C); the disease is Shwachman-Bodian-Diamond syndrome and the target is Shwachman syndrome gene; the disease is Gaucher's disease and the target is Glucocerebrosidase; the disease is Hemophilia A and the target is Anti-hemophiliac factor or Factor VIII; the disease is Hemophilia B and the target is Christmas factor, Serine protease, Factor IX; the disease is Adenosine deaminase deficiency (ADA-SCID) and the target is Adenosine deaminase; the disease is GM1 gangliosidoses and the target is beta-galactosidase; the disease is Glycogen storage disease type II, Pompe disease, or acid maltase deficiency and the target is acid alpha-glucosidase; the disease is Niemann-Pick disease, SMPD1-associated (Types A and B) and the target is Sphingomyelin phosphodiesterase 1 or acid sphingomyelinase; the disease is Krabbe disease, globoid cell leukodystrophy, or galactosylceramide lipidosis and the target is Galactosylceramidase or galactocerebrosidase; the disease is Multiple Sclerosis (MS) and the target is Human leukocyte antigens DR-15, DQ-6, DRB1; or the disease is Herpes Simplex Virus 1 or 2 and the target is knocking down of one, two or three of RS1, RL2 and/or LAT genes. In embodiments, the disease is an HPV associated cancer with treatment including edited cells comprising binding molecules, such as TCRs or antigen binding fragments thereof and antibodies and antigen-binding fragments thereof, such as those that recognize or bind human papilloma virus. The disease can be Hepatitis B with a target of one or more of PreC, C, X, PreS1, PreS2, S, P and/or SP gene(s).

In embodiments, the immune disease is severe combined immunodeficiency (SCID), Omenn syndrome, and in one aspect the target is Recombination Activating Gene 1 (RAG1) or an interleukin-7 receptor (IL7R). In particular embodiments, the disease is Transthyretin Amyloidosis (ATTR), Familial amyloid cardiomyopathy, and in one aspect, the target is the TTR gene, including one or more mutations in the TTR gene. In embodiments, the disease is Alpha-1 Antitrypsin Deficiency (AATD) or another disease in which Alpha-1 Antitrypsin is implicated, for example GvHD, Organ transplant rejection, diabetes, liver disease, COPD, Emphysema and Cystic Fibrosis; in particular embodiments, the target is SERPINA1.

In embodiments, the disease is primary hyperoxaluria, for which, in certain embodiments, the target comprises one or more of Lactate dehydrogenase A (LDHA) and hydroxy Acid Oxidase 1 (HAO1). In embodiments, the disease is primary hyperoxaluria type 1 (PH1) and other alanine-glyoxylate aminotransferase (AGXT) gene related conditions or disorders, such as Adenocarcinoma, Chronic Alcoholic Intoxication, Alzheimer's Disease, Cooley's anemia, Aneurysm, Anxiety Disorders, Asthma, Malignant neoplasm of breast, Malignant neoplasm of skin, Renal Cell Carcinoma, Cardiovascular Diseases, Malignant tumor of cervix, Coronary Arteriosclerosis, Coronary heart disease, Diabetes, Diabetes Mellitus, Diabetes Mellitus Non-Insulin-Dependent, Diabetic Nephropathy, Eclampsia, Eczema, Subacute Bacterial Endocarditis, Glioblastoma, Glycogen storage disease type II, Sensorineural Hearing Loss (disorder), Hepatitis, Hepatitis A, Hepatitis B, Homocystinuria, Hereditary Sensory Autonomic Neuropathy Type 1, Hyperaldosteronism, Hypercholesterolemia, Hyperoxaluria, Primary Hyperoxaluria, Hypertensive disease, Inflammatory Bowel Diseases, Kidney Calculi, Kidney Diseases, Chronic Kidney Failure, leiomyosarcoma, Metabolic Diseases, Inborn Errors of Metabolism, Mitral Valve Prolapse Syndrome, Myocardial Infarction, Neoplasm Metastasis, Nephrotic Syndrome, Obesity, Ovarian Diseases, Periodontitis, Polycystic Ovary Syndrome, Kidney Failure, Adult Respiratory Distress Syndrome, Retinal Diseases, Cerebrovascular accident, Turner Syndrome, Viral hepatitis, Tooth Loss, Premature Ovarian Failure, Essential Hypertension, Left Ventricular Hypertrophy, Migraine Disorders, Cutaneous Melanoma, Hypertensive heart disease, Chronic glomerulonephritis, Migraine with Aura, Secondary hypertension, Acute myocardial infarction, Atherosclerosis of aorta, Allergic asthma, pineoblastoma, Malignant neoplasm of lung, Primary hyperoxaluria type I, Primary hyperoxaluria type 2, Inflammatory Breast Carcinoma, Cervix carcinoma, Restenosis, Bleeding ulcer, 
Generalized glycogen storage disease of infants, Nephrolithiasis, Chronic rejection of renal transplant, Urolithiasis, pricking of skin, Metabolic Syndrome X, Maternal hypertension, Carotid Atherosclerosis, Carcinogenesis, Breast Carcinoma, Carcinoma of lung, Nephronophthisis, Microalbuminuria, Familial Retinoblastoma, Systolic Heart Failure, Ischemic stroke, Left ventricular systolic dysfunction, Cauda Equina Paraganglioma, Hepatocarcinogenesis, Chronic Kidney Diseases, Glioblastoma Multiforme, Non-Neoplastic Disorder, Calcium Oxalate Nephrolithiasis, Ablepharon-Macrostomia Syndrome, Coronary Artery Disease, Liver carcinoma, Chronic kidney disease stage 5, Allergic rhinitis (disorder), Crigler Najjar syndrome type 2, and Ischemic Cerebrovascular Accident. In certain embodiments, treatment is targeted to the liver. In embodiments, the gene is AGXT, with a cytogenetic location of 2q37.3, and the genomic coordinates are on Chromosome 2 on the forward strand at position 240,868,479-240,880,502.

Treatment can also target collagen type VII alpha 1 chain (COL7A1) gene related conditions or disorders, such as Malignant neoplasm of skin, Squamous cell carcinoma, Colorectal Neoplasms, Crohn Disease, Epidermolysis Bullosa, Indirect Inguinal Hernia, Pruritus, Schizophrenia, Dermatologic disorders, Genetic Skin Diseases, Teratoma, Cockayne-Touraine Disease, Epidermolysis Bullosa Acquisita, Epidermolysis Bullosa Dystrophica, Junctional Epidermolysis Bullosa, Hallopeau-Siemens Disease, Bullous Skin Diseases, Agenesis of corpus callosum, Dystrophia unguium, Vesicular Stomatitis, Epidermolysis Bullosa With Congenital Localized Absence Of Skin And Deformity Of Nails, Juvenile Myoclonic Epilepsy, Squamous cell carcinoma of esophagus, Poikiloderma of Kindler, pretibial Epidermolysis bullosa, Dominant dystrophic epidermolysis bullosa albopapular type (disorder), Localized recessive dystrophic epidermolysis bullosa, Generalized dystrophic epidermolysis bullosa, Squamous cell carcinoma of skin, Epidermolysis Bullosa Pruriginosa, Mammary Neoplasms, Epidermolysis Bullosa Simplex Superficialis, Isolated Toenail Dystrophy, Transient bullous dermolysis of the newborn, Autosomal Recessive Epidermolysis Bullosa Dystrophica Localisata Variant, and Autosomal Recessive Epidermolysis Bullosa Dystrophica Inversa.

In embodiments, the disease is acute myeloid leukemia (AML), targeting Wilms Tumor 1 (WT1) and HLA expressing cells. In embodiments, the therapy is T cell therapy, as described elsewhere herein, comprising engineered T cells with WT1 specific TCRs. In certain embodiments, the target is CD157 in AML.

In embodiments, the disease is a blood disease. In certain embodiments, the disease is hemophilia, and in one aspect the target is Factor XI. In other embodiments, the disease is a hemoglobinopathy, such as sickle cell disease, sickle cell trait, hemoglobin C disease, hemoglobin C trait, hemoglobin S/C disease, hemoglobin D disease, hemoglobin E disease, a thalassemia, a condition associated with hemoglobin with increased oxygen affinity, a condition associated with hemoglobin with decreased oxygen affinity, unstable hemoglobin disease, or methemoglobinemia. Hemostasis and Factor X and XII deficiencies can also be treated. In embodiments, the target is BCL11A gene (e.g., a human BCL11a gene), a BCL11a enhancer (e.g., a human BCL11a enhancer), or a HPFH region (e.g., a human HPFH region), beta globin, fetal hemoglobin, γ-globin genes (e.g., HBG1, HBG2, or HBG1 and HBG2), the erythroid specific enhancer of the BCL11A gene (BCL11Ae), or a combination thereof.

In embodiments, the target locus can be one or more of TRAC, TRBC1, TRBC2, CD3E, CD3G, CD3D, B2M, CIITA, CD247, HLA-A, HLA-B, HLA-C, DCK, CD52, FKBP1A, NLRC5, RFXANK, RFX5, RFXAP, NR3C1, CD274, HAVCR2, LAG3, PDCD1, PD-L2, HCF2, PAI, TFPI, PLAT, PLAU, PLG, RPOZ, F7, F8, F9, F2, F5, F7, F10, F11, F12, F13A1, F13B, STAT1, FOXP3, IL2RG, DCLRE1C, ICOS, MHC2TA, GALNS, HGSNAT, ARSB, RFXAP, CD20, CD81, TNFRSF13B, SEC23B, PKLR, IFNG, SPTB, SPTA, SLC4A1, EPO, EPB42, CSF2, CSF3, VWF, SERPINCA1, CTLA4, CEACAM (e.g., CEACAM-1, CEACAM-3 and/or CEACAM-5), VISTA, BTLA, TIGIT, LAIR1, CD160, 2B4, CD80, CD86, B7-H3 (CD113), B7-H4 (VTCN1), HVEM (TNFRSF14 or CD107), KIR, A2aR, MHC class I, MHC class II, GAL9, adenosine, and TGF beta, PTPN11, and combinations thereof. In embodiments, the target sequence is within the genomic nucleic acid sequence at Chr11:5,250,094-5,250,237, - strand, hg38; Chr11:5,255,022-5,255,164, - strand, hg38; nondeletional HPFH region; Chr11:5,249,833 to Chr11:5,250,237, - strand, hg38; Chr11:5,254,738 to Chr11:5,255,164, - strand, hg38; Chr11:5,249,833-5,249,927, - strand, hg38; Chr11:5,254,738-5,254,851, - strand, hg38; Chr11:5,250,139-5,250,237, - strand, hg38.

In embodiments, the disease is associated with high cholesterol, and regulation of cholesterol is provided; in some embodiments, regulation is effected by modification in the target PCSK9. Other diseases in which PCSK9 can be implicated, and which thus would be a target for the systems and methods described herein, include Abetalipoproteinemia, Adenoma, Arteriosclerosis, Atherosclerosis, Cardiovascular Diseases, Cholelithiasis, Coronary Arteriosclerosis, Coronary heart disease, Non-Insulin-Dependent Diabetes Mellitus, Hypercholesterolemia, Familial Hypercholesterolemia, Hyperinsulinism, Hyperlipidemia, Familial Combined Hyperlipidemia, Hypobetalipoproteinemias, Chronic Kidney Failure, Liver diseases, Liver neoplasms, melanoma, Myocardial Infarction, Narcolepsy, Neoplasm Metastasis, Nephroblastoma, Obesity, Peritonitis, Pseudoxanthoma Elasticum, Cerebrovascular accident, Vascular Diseases, Xanthomatosis, Peripheral Vascular Diseases, Myocardial Ischemia, Dyslipidemias, Impaired glucose tolerance, Xanthoma, Polygenic hypercholesterolemia, Secondary malignant neoplasm of liver, Dementia, Overweight, Chronic Hepatitis C, Carotid Atherosclerosis, Hyperlipoproteinemia Type IIa, Intracranial Atherosclerosis, Ischemic stroke, Acute Coronary Syndrome, Aortic calcification, Cardiovascular morbidity, Hyperlipoproteinemia Type IIb, Peripheral Arterial Diseases, Familial Hyperaldosteronism Type II, Familial hypobetalipoproteinemia, Autosomal Recessive Hypercholesterolemia, Autosomal Dominant Hypercholesterolemia 3, Coronary Artery Disease, Liver carcinoma, Ischemic Cerebrovascular Accident, and Arteriosclerotic cardiovascular disease NOS. In embodiments, the treatment can be targeted to the liver, the primary location of activity of PCSK9.

In embodiments, the disease or disorder is Hyper IgM syndrome or a disorder characterized by defective CD40 signaling. In certain embodiments, the insertion of CD40L exons is used to restore proper CD40 signaling and B cell class switch recombination. In particular embodiments, the target is CD40 ligand (CD40L), edited at one or more of exons 2-5 of the CD40L gene, in cells, e.g., T cells or hematopoietic stem cells (HSCs).

In embodiments, the disease is merosin-deficient congenital muscular dystrophy (MDCMD) and other laminin, alpha 2 (LAMA2) gene related conditions or disorders. The therapy can be targeted to the muscle, for example, skeletal muscle, smooth muscle, and/or cardiac muscle. In certain embodiments, the target is Laminin, Alpha 2 (LAMA2), which may also be referred to as Laminin-12 Subunit Alpha, Laminin-2 Subunit Alpha, Laminin-4 Subunit Alpha 3, Merosin Heavy Chain, Laminin M Chain, LAMM, Congenital Muscular Dystrophy and Merosin. LAMA2 has a cytogenetic location of 6q22.33 and the genomic coordinates are on Chromosome 6 on the forward strand at position 128,883,141-129,516,563. In embodiments, the disease treated can be Merosin-Deficient Congenital Muscular Dystrophy (MDCMD), Amyotrophic Lateral Sclerosis, Bladder Neoplasm, Charcot-Marie-Tooth Disease, Colorectal Carcinoma, Contracture, Cyst, Duchenne Muscular Dystrophy, Fatigue, Hyperopia, Renovascular Hypertension, melanoma, Mental Retardation, Myopathy, Muscular Dystrophy, Myopia, Myositis, Neuromuscular Diseases, Peripheral Neuropathy, Refractive Errors, Schizophrenia, Severe mental retardation (I.Q. 20-34), Thyroid Neoplasm, Tobacco Use Disorder, Severe Combined Immunodeficiency, Synovial Cyst, Adenocarcinoma of lung (disorder), Tumor Progression, Strawberry nevus of skin, Muscle degeneration, Microdontia (disorder), Walker-Warburg congenital muscular dystrophy, Chronic Periodontitis, Leukoencephalopathies, Impaired cognition, Fukuyama Type Congenital Muscular Dystrophy, Scleroatonic muscular dystrophy, Eichsfeld type congenital muscular dystrophy, Neuropathy, Muscle eye brain disease, Limb-Girdle Muscular Dystrophies, Congenital muscular dystrophy (disorder), Muscle fibrosis, cancer recurrence, Drug Resistant Epilepsy, Respiratory Failure, Myxoid cyst, Abnormal breathing, Muscular dystrophy congenital merosin negative, Colorectal Cancer, Congenital Muscular Dystrophy due to Partial LAMA2 Deficiency, and Autosomal Dominant Craniometaphyseal Dysplasia.

In certain embodiments, the target is an AAVS1 (PPP1R12C) gene, an ALB gene, an Angptl3 gene, an ApoC3 gene, an ASGR2 gene, a CCR5 gene, a FIX (F9) gene, a G6PC gene, a Gys2 gene, an HGD gene, a Lp(a) gene, a Pcsk9 gene, a Serpina1 gene, a TF gene, or a TTR gene. Assessment of efficiency of HDR/NHEJ mediated knock-in of cDNA into the first exon can utilize cDNA knock-in into “safe harbor” sites such as: single-stranded or double-stranded DNA having homologous arms to one of the following regions, for example: ApoC3 (chr11:116829908-116833071), Angptl3 (chr1:62,597,487-62,606,305), Serpina1 (chr14:94376747-94390692), Lp(a) (chr6:160531483-160664259), Pcsk9 (chr1:55,039,475-55,064,852), FIX (chrX:139,530,736-139,563,458), ALB (chr4:73,404,254-73,421,411), TTR (chr18:31,591,766-31,599,023), TF (chr3:133,661,997-133,779,005), G6PC (chr17:42,900,796-42,914,432), Gys2 (chr12:21,536,188-21,604,857), AAVS1 (PPP1R12C) (chr19:55,090,912-55,117,599), HGD (chr3:120,628,167-120,682,570), CCR5 (chr3:46,370,854-46,376,206), or ASGR2 (chr17:7,101,322-7,114,310).
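The regions listed above are written in two styles, with and without thousands separators. As a convenience for the reader, the following Python sketch shows one possible way to normalize such strings into a chromosome, start and end. It is an illustration under stated assumptions rather than part of the disclosed methods: the names `Region` and `parse_region` are hypothetical, and the span calculation assumes 1-based inclusive coordinates, which the text above does not specify.

```python
# Illustrative sketch only: normalize region strings such as
# "chr1:55,039,475-55,064,852" into (chromosome, start, end).
# Region and parse_region are hypothetical names; 1-based inclusive
# coordinates are an assumption, not something stated in the text.
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Span in bases under the assumed 1-based inclusive convention.
        return self.end - self.start + 1


_REGION_RE = re.compile(r"^(chr[0-9XYM]+):(\d+)-(\d+)$")


def parse_region(text: str) -> Region:
    """Parse 'chrN:start-end', ignoring commas and stray whitespace."""
    cleaned = text.replace(",", "").replace(" ", "")
    match = _REGION_RE.match(cleaned)
    if match is None:
        raise ValueError(f"unrecognized region string: {text!r}")
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start exceeds end in region: {text!r}")
    return Region(match.group(1), start, end)


if __name__ == "__main__":
    for label, text in [
        ("ApoC3", "chr11:116829908-116833071"),
        ("Pcsk9", "chr1:55,039,475-55,064,852"),
    ]:
        region = parse_region(text)
        print(label, region.chrom, region.start, region.end, region.length)
```

Under the inclusive assumption, the ApoC3 region above spans 3,164 bases and the Pcsk9 region spans 25,378 bases; a half-open convention would give values one smaller.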

In one aspect, the target is superoxide dismutase 1, soluble (SOD1), which can aid in treatment of a disease or disorder associated with the gene. In particular embodiments, the disease or disorder is associated with SOD1, and can be, for example, Adenocarcinoma, Albuminuria, Chronic Alcoholic Intoxication, Alzheimer's Disease, Amnesia, Amyloidosis, Amyotrophic Lateral Sclerosis, Anemia, Autoimmune hemolytic anemia, Sickle Cell Anemia, Anoxia, Anxiety Disorders, Aortic Diseases, Arteriosclerosis, Rheumatoid Arthritis, Asphyxia Neonatorum, Asthma, Atherosclerosis, Autistic Disorder, Autoimmune Diseases, Barrett Esophagus, Behcet Syndrome, Malignant neoplasm of urinary bladder, Brain Neoplasms, Malignant neoplasm of breast, Oral candidiasis, Malignant tumor of colon, Bronchogenic Carcinoma, Non-Small Cell Lung Carcinoma, Squamous cell carcinoma, Transitional Cell Carcinoma, Cardiovascular Diseases, Carotid Artery Thrombosis, Neoplastic Cell Transformation, Cerebral Infarction, Brain Ischemia, Transient Ischemic Attack, Charcot-Marie-Tooth Disease, Cholera, Colitis, Colorectal Carcinoma, Coronary Arteriosclerosis, Coronary heart disease, Infection by Cryptococcus neoformans, Deafness, Cessation of life, Deglutition Disorders, Presenile dementia, Depressive disorder, Contact Dermatitis, Diabetes, Diabetes Mellitus, Experimental Diabetes Mellitus, Insulin-Dependent Diabetes Mellitus, Non-Insulin-Dependent Diabetes Mellitus, Diabetic Angiopathies, Diabetic Nephropathy, Diabetic Retinopathy, Down Syndrome, Dwarfism, Edema, Japanese Encephalitis, Toxic Epidermal Necrolysis, Temporal Lobe Epilepsy, Exanthema, Muscular fasciculation, Alcoholic Fatty Liver, Fetal Growth Retardation, Fibromyalgia, Fibrosarcoma, Fragile X Syndrome, Giardiasis, Glioblastoma, Glioma, Headache, Partial Hearing Loss, Cardiac Arrest, Heart failure, Atrial Septal Defects, Helminthiasis, Hemochromatosis, Hemolysis (disorder), Chronic Hepatitis, HIV Infections, Huntington Disease, Hypercholesterolemia, Hyperglycemia, Hyperplasia, 
Hypertensive disease, Hyperthyroidism, Hypopituitarism, Hypoproteinemia, Hypotension, natural Hypothermia, Hypothyroidism, Immunologic Deficiency Syndromes, Immune System Diseases, Inflammation, Inflammatory Bowel Diseases, Influenza, Intestinal Diseases, Ischemia, Kearns-Sayre syndrome, Keratoconus, Kidney Calculi, Kidney Diseases, Acute Kidney Failure, Chronic Kidney Failure, Polycystic Kidney Diseases, leukemia, Myeloid Leukemia, Acute Promyelocytic Leukemia, Liver Cirrhosis, Liver diseases, Liver neoplasms, Locked-In Syndrome, Chronic Obstructive Airway Disease, Lung Neoplasms, Systemic Lupus Erythematosus, Non-Hodgkin Lymphoma, Machado-Joseph Disease, Malaria, Malignant neoplasm of stomach, Animal Mammary Neoplasms, Marfan Syndrome, Meningomyelocele, Mental Retardation, Mitral Valve Stenosis, Acquired Dental Fluorosis, Movement Disorders, Multiple Sclerosis, Muscle Rigidity, Muscle Spasticity, Muscular Atrophy, Spinal Muscular Atrophy, Myopathy, Mycoses, Myocardial Infarction, Myocardial Reperfusion Injury, Necrosis, Nephrosis, Nephrotic Syndrome, Nerve Degeneration, nervous system disorder, Neuralgia, Neuroblastoma, Neuroma, Neuromuscular Diseases, Obesity, Occupational Diseases, Ocular Hypertension, Oligospermia, Degenerative polyarthritis, Osteoporosis, Ovarian Carcinoma, Pain, Pancreatitis, Papillon-Lefevre Disease, Paresis, Parkinson Disease, Phenylketonurias, Pituitary Diseases, Pre-Eclampsia, Prostatic Neoplasms, Protein Deficiency, Proteinuria, Psoriasis, Pulmonary Fibrosis, Renal Artery Obstruction, Reperfusion Injury, Retinal Degeneration, Retinal Diseases, Retinoblastoma, Schistosomiasis, Schistosomiasis mansoni, Schizophrenia, Scrapie, Seizures, Age-related cataract, Compression of spinal cord, Cerebrovascular accident, Subarachnoid Hemorrhage, Progressive supranuclear palsy, Tetanus, Trisomy, Turner Syndrome, Unipolar Depression, Urticaria, Vitiligo, Vocal Cord Paralysis, Intestinal Volvulus, Weight Gain, HMN (Hereditary Motor Neuropathy) Proximal Type I, Holoprosencephaly, 
Motor Neuron Disease, Neurofibrillary degeneration (morphologic abnormality), Burning sensation, Apathy, Mood swings, Synovial Cyst, Cataract, Migraine Disorders, Sciatic Neuropathy, Sensory neuropathy, Atrophic condition of skin, Muscle Weakness, Esophageal carcinoma, Lingual-Facial-Buccal Dyskinesia, Idiopathic pulmonary hypertension, Lateral Sclerosis, Migraine with Aura, Mixed Conductive-Sensorineural Hearing Loss, Iron deficiency anemia, Malnutrition, Prion Diseases, Mitochondrial Myopathies, MELAS Syndrome, Chronic progressive external ophthalmoplegia, General Paralysis, Premature aging syndrome, Fibrillation, Psychiatric symptom, Memory impairment, Muscle degeneration, Neurologic Symptoms, Gastric hemorrhage, Pancreatic carcinoma, Pick Disease of the Brain, Liver Fibrosis, Malignant neoplasm of lung, Age related macular degeneration, Parkinsonian Disorders, Disease Progression, Hypocupremia, Cytochrome-c Oxidase Deficiency, Essential Tremor, Familial Motor Neuron Disease, Lower Motor Neuron Disease, Degenerative myelopathy, Diabetic Polyneuropathies, Liver and Intrahepatic Biliary Tract Carcinoma, Persian Gulf Syndrome, Senile Plaques, Atrophic, Frontotemporal dementia, Semantic Dementia, Common Migraine, Impaired cognition, Malignant neoplasm of liver, Malignant neoplasm of pancreas, Malignant neoplasm of prostate, Pure Autonomic Failure, Motor symptoms, Spastic, Dementia, Neurodegenerative Disorders, Chronic Hepatitis C, Guam Form Amyotrophic Lateral Sclerosis, Stiff limbs, Multisystem disorder, Loss of scalp hair, Prostate carcinoma, Hepatopulmonary Syndrome, Hashimoto Disease, Progressive Neoplastic Disease, Breast Carcinoma, Terminal illness, Carcinoma of lung, Tardive Dyskinesia, Secondary malignant neoplasm of lymph node, Colon Carcinoma, Stomach Carcinoma, Central neuroblastoma, Dissecting aneurysm of the thoracic aorta, Diabetic macular edema, Microalbuminuria, Middle Cerebral Artery Occlusion, Middle Cerebral Artery Infarction, Upper motor neuron signs, Frontotemporal Lobar 
Degeneration, Memory Loss, Classical phenylketonuria, CADASIL Syndrome, Neurologic Gait Disorders, Spinocerebellar Ataxia Type 2, Spinal Cord Ischemia, Lewy Body Disease, Muscular Atrophy, Spinobulbar, Chromosome 21 monosomy, Thrombocytosis, Spots on skin, Drug-Induced Liver Injury, Hereditary Leber Optic Atrophy, Cerebral Ischemia, ovarian neoplasm, Tauopathies, Macroangiopathy, Persistent pulmonary hypertension, Malignant neoplasm of ovary, Myxoid cyst, Drusen, Sarcoma, Weight decreased, Major Depressive Disorder, Mild cognitive disorder, Degenerative disorder, Partial Trisomy, Cardiovascular morbidity, hearing impairment, Cognitive changes, Ureteral Calculi, Mammary Neoplasms, Colorectal Cancer, Chronic Kidney Diseases, Minimal Change Nephrotic Syndrome, Non-Neoplastic Disorder, X-Linked Bulbo-Spinal Atrophy, Mammographic Density, Normal Tension Glaucoma Susceptibility To (Finding), Vitiligo-Associated Multiple Autoimmune Disease Susceptibility 1 (Finding), Amyotrophic Lateral Sclerosis And/Or Frontotemporal Dementia 1, Amyotrophic Lateral Sclerosis 1, Sporadic Amyotrophic Lateral Sclerosis, monomelic Amyotrophy, Coronary Artery Disease, Transformed migraine, Regurgitation, Urothelial Carcinoma, Motor disturbances, Liver carcinoma, Protein Misfolding Disorders, TDP-43 Proteinopathies, Promyelocytic leukemia, Weight Gain Adverse Event, Mitochondrial cytopathy, Idiopathic pulmonary arterial hypertension, Progressive cGVHD, Infection, GRN-related frontotemporal dementia, Mitochondrial pathology, and Hearing Loss.

In particular embodiments, the disease is associated with the gene ATXN1, ATXN2, or ATXN3, which may be targeted for treatment. In some embodiments, the CAG repeat region located in exon 8 of ATXN1, exon 1 of ATXN2, or exon 10 of ATXN3 is targeted. In embodiments, the disease is spinocerebellar ataxia 3 (SCA3), SCA1, or SCA2 and other related disorders, such as Congenital Abnormality, Alzheimer's Disease, Amyotrophic Lateral Sclerosis, Ataxia, Ataxia Telangiectasia, Cerebellar Ataxia, Cerebellar Diseases, Chorea, Cleft Palate, Cystic Fibrosis, Mental Depression, Depressive disorder, Dystonia, Esophageal Neoplasms, Exotropia, Cardiac Arrest, Huntington Disease, Machado-Joseph Disease, Movement Disorders, Muscular Dystrophy, Myotonic Dystrophy, Narcolepsy, Nerve Degeneration, Neuroblastoma, Parkinson Disease, Peripheral Neuropathy, Restless Legs Syndrome, Retinal Degeneration, Retinitis Pigmentosa, Schizophrenia, Shy-Drager Syndrome, Sleep disturbances, Hereditary Spastic Paraplegia, Thromboembolism, Stiff-Person Syndrome, Spinocerebellar Ataxia, Esophageal carcinoma, Polyneuropathy, Effects of heat, Muscle twitch, Extrapyramidal sign, Ataxic, Neurologic Symptoms, Cerebral atrophy, Parkinsonian Disorders, Protein S Deficiency, Cerebellar degeneration, Familial Amyloid Neuropathy Portuguese Type, Spastic syndrome, Vertical Nystagmus, Nystagmus End-Position, Antithrombin III Deficiency, Atrophic, Complicated hereditary spastic paraplegia, Multiple System Atrophy, Pallidoluysian degeneration, Dystonia Disorders, Pure Autonomic Failure, Thrombophilia, Protein C Deficiency, Congenital Myotonic Dystrophy, Motor symptoms, Neuropathy, Neurodegenerative Disorders, Malignant neoplasm of esophagus, Visual disturbance, Activated Protein C Resistance, Terminal illness, Myokymia, Central neuroblastoma, Dyssomnias, Appendicular Ataxia, Narcolepsy-Cataplexy Syndrome, Machado-Joseph Disease Type I, Machado-Joseph Disease Type II, Machado-Joseph Disease Type III, Dentatorubral-Pallidoluysian Atrophy, Gait Ataxia, Spinocerebellar Ataxia Type 1, Spinocerebellar Ataxia Type 2, Spinocerebellar Ataxia Type 6 (disorder), Spinocerebellar Ataxia Type 7, Muscular Spinobulbar Atrophy, Genomic Instability, Episodic ataxia type 2 (disorder), Bulbo-Spinal Atrophy X-Linked, Fragile X Tremor/Ataxia Syndrome, Thrombophilia Due to Activated Protein C Resistance (Disorder), Amyotrophic Lateral Sclerosis 1, Neuronal Intranuclear Inclusion Disease, Hereditary Antithrombin III Deficiency, and Late-Onset Parkinson Disease.

In embodiments, the disease is associated with expression of a tumor antigen, whether a cancer or a non-cancer related indication, for example acute lymphoid leukemia, diffuse large B cell lymphoma, follicular lymphoma, chronic lymphocytic leukemia, Hodgkin lymphoma, or non-Hodgkin lymphoma. In embodiments, the target can be a TET2 intron, a TET2 intron-exon junction, or a sequence within a genomic region of chr4.

In embodiments, neurodegenerative diseases can be treated. In particular embodiments, the target is Synuclein, Alpha (SNCA). In certain embodiments, the disorder treated is a pain related disorder, including congenital pain insensitivity, Compressive Neuropathies, Paroxysmal Extreme Pain Disorder, High grade atrioventricular block, Small Fiber Neuropathy, and Familial Episodic Pain Syndrome 2. In certain embodiments, the target is Sodium Channel, Voltage Gated, Type X Alpha Subunit (SCN10A).

In certain embodiments, hematopoietic stem cells and progenitor stem cells are edited, including knock-ins. In particular embodiments, the knock-in is for treatment of lysosomal storage diseases, glycogen storage diseases, mucopolysaccharidoses, or any disease in which the secretion of a protein will ameliorate the disease. In one embodiment, the disease is sickle cell disease (SCD). In another embodiment, the disease is β-thalassemia.

In certain embodiments, the T cell or NK cell is used for cancer treatment and may include T cells comprising the recombinant receptor (e.g., CAR) and one or more phenotypic markers selected from CCR7+, 4-1BB+ (CD137+), TIM3+, CD27+, CD62L+, CD127+, CD45RA+, CD45RO−, t-bet(low), IL-7Rα+, CD95+, IL-2Rβ+, CXCR3+ or LFA-1+. In certain embodiments, the editing of a T cell for cancer immunotherapy comprises altering one or more T-cell expressed genes, e.g., one or more of the FAS, BID, CTLA4, PDCD1, CBLB, PTPN6, B2M, TRAC and TRBC genes. In some embodiments, editing includes alterations introduced into, or proximate to, the CBLB target sites to reduce CBLB gene expression in T cells for treatment of proliferative diseases and may include larger insertions or deletions at one or more CBLB target sites. The TGFBR2 target sequence for T cell editing can be located, for example, in exon 3, 4, or 5 of the TGFBR2 gene, and the editing can be utilized for treatment of cancers and lymphoma.

Cells for transplantation can be edited and may include allele-specific modification of one or more immunogenicity genes (e.g., an HLA gene) of a cell, e.g., HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DRB3/4/5, HLA-DQ, and HLA-DP MiHAs, and any other MHC Class I or Class II genes or loci, which may include delivery of one or more matched recipient HLA alleles into the original position(s) where the one or more mismatched donor HLA alleles are located, and may include inserting one or more matched recipient HLA alleles into a “safe harbor” locus. In an embodiment, the method further includes introducing a chemotherapy resistance gene for in vivo selection in a gene.

Methods and systems can target Dystrophia Myotonica-Protein Kinase (DMPK) for editing. In particular embodiments, the target is the CTG trinucleotide repeat in the 3′ untranslated region (UTR) of the DMPK gene. Disorders or diseases associated with DMPK include Atherosclerosis, Azoospermia, Hypertrophic Cardiomyopathy, Celiac Disease, Congenital chromosomal disease, Diabetes Mellitus, Focal glomerulosclerosis, Huntington Disease, Hypogonadism, Muscular Atrophy, Myopathy, Muscular Dystrophy, Myotonia, Myotonic Dystrophy, Neuromuscular Diseases, Optic Atrophy, Paresis, Schizophrenia, Cataract, Spinocerebellar Ataxia, Muscle Weakness, Adrenoleukodystrophy, Centronuclear myopathy, Interstitial fibrosis, myotonic muscular dystrophy, Abnormal mental state, X-linked Charcot-Marie-Tooth disease 1, Congenital Myotonic Dystrophy, Bilateral cataracts (disorder), Congenital Fiber Type Disproportion, Myotonic Disorders, Multisystem disorder, 3-Methylglutaconic aciduria type 3, cardiac event, Cardiogenic Syncope, Congenital Structural Myopathy, Mental handicap, Adrenomyeloneuropathy, Dystrophia myotonica 2, and Intellectual Disability.
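As a non-limiting illustration of how a trinucleotide repeat tract of the kind described above (e.g., the CTG repeat in the DMPK 3′ UTR or the CAG repeats in ATXN1, ATXN2, or ATXN3) might be located in a sequence before guide selection, the following minimal sketch scans a sequence for runs of a repeat unit. The function name `find_repeat_tracts`, the `min_repeats` cutoff, and the toy sequence are hypothetical and are not taken from this disclosure.

```python
import re

def find_repeat_tracts(seq, unit="CTG", min_repeats=5):
    """Return (start, end, n_repeats) for each run of `unit` that is
    repeated at least `min_repeats` times (0-based, end-exclusive)."""
    seq = seq.upper()
    # Non-capturing group repeated min_repeats or more times, e.g. (?:CTG){5,}
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_repeats))
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(unit))
        for m in pattern.finditer(seq)
    ]

# Toy sequence (an assumption for illustration, not a real DMPK sequence).
toy_utr = "GATTACA" + "CTG" * 12 + "GGGAATTC"
tracts = find_repeat_tracts(toy_utr, unit="CTG", min_repeats=5)
print(tracts)
```

The reported coordinates could then be used to restrict candidate guide sequences to positions within or flanking the repeat tract; the appropriate cutoff for a given locus is a design choice left to the skilled person.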

In embodiments, the disease is an inborn error of metabolism. The disease may be selected from Disorders of Carbohydrate Metabolism (glycogen storage disease, G6PD deficiency), Disorders of Amino Acid Metabolism (phenylketonuria, maple syrup urine disease, glutaric acidemia type 1), Urea Cycle Disorder or Urea Cycle Defects (carbamoyl phosphate synthetase I deficiency), Disorders of Organic Acid Metabolism (alkaptonuria, 2-hydroxyglutaric acidurias), Disorders of Fatty Acid Oxidation/Mitochondrial Metabolism (medium-chain acyl-coenzyme A dehydrogenase deficiency), Disorders of Porphyrin Metabolism (acute intermittent porphyria), Disorders of Purine/Pyrimidine Metabolism (Lesch-Nyhan syndrome), Disorders of Steroid Metabolism (lipoid congenital adrenal hyperplasia, congenital adrenal hyperplasia), Disorders of Mitochondrial Function (Kearns-Sayre syndrome), Disorders of Peroxisomal Function (Zellweger syndrome), or Lysosomal Storage Disorders (Gaucher's disease, Niemann-Pick disease).

In embodiments, the target can comprise Recombination Activating Gene 1 (RAG1), BCL11A, PCSK9, laminin, alpha 2 (LAMA2), ATXN3, alanine-glyoxylate aminotransferase (AGXT), collagen type VII alpha 1 chain (COL7A1), spinocerebellar ataxia type 1 protein (ATXN1), Angiopoietin-like 3 (ANGPTL3), Frataxin (FXN), Superoxide Dismutase 1, soluble (SOD1), Synuclein, Alpha (SNCA), Sodium Channel, Voltage Gated, Type X Alpha Subunit (SCN10A), Spinocerebellar Ataxia Type 2 Protein (ATXN2), Dystrophia Myotonica-Protein Kinase (DMPK), beta globin locus on chromosome 11, acyl-coenzyme A dehydrogenase for medium chain fatty acids (ACADM), long-chain 3-hydroxyacyl-coenzyme A dehydrogenase for long chain fatty acids (HADHA), acyl-coenzyme A dehydrogenase for very long-chain fatty acids (ACADVL), Apolipoprotein C3 (APOCIII), Transthyretin (TTR), Angiopoietin-like 4 (ANGPTL4), Sodium Voltage-Gated Channel Alpha Subunit 9 (SCN9A), Interleukin-7 receptor (IL7R), glucose-6-phosphatase, catalytic (G6PC), haemochromatosis (HFE), SERPINA1, C9ORF72, β-globin, dystrophin, or γ-globin.

In certain embodiments, the disease or disorder is associated with Apolipoprotein C3 (APOCIII), which can be targeted for editing. In embodiments, the disease or disorder may be Dyslipidemias, Hyperalphalipoproteinemia Type 2, Lupus Nephritis, Wilms Tumor 5, Morbid obesity and spermatogenic, Glaucoma, Diabetic Retinopathy, Arthrogryposis renal dysfunction cholestasis syndrome, Cognition Disorders, Altered response to myocardial infarction, Glucose Intolerance, Positive regulation of triglyceride biosynthetic process, Renal Insufficiency, Chronic, Hyperlipidemias, Chronic Kidney Failure, Apolipoprotein C-III Deficiency, Coronary Disease, Neonatal Diabetes Mellitus, Neonatal, with Congenital Hypothyroidism, Hypercholesterolemia Autosomal Dominant 3, Hyperlipoproteinemia Type III, Hyperthyroidism, Coronary Artery Disease, Renal Artery Obstruction, Metabolic Syndrome X, Hyperlipidemia, Familial Combined, Insulin Resistance, Transient infantile hypertriglyceridemia, Diabetic Nephropathies, Diabetes Mellitus (Type 1), Nephrotic Syndrome Type 5 with or without ocular abnormalities, and Hemorrhagic Fever with renal syndrome.

In certain embodiments, the target is Angiopoietin-like 4 (ANGPTL4). ANGPTL4 is a regulator of angiogenesis and modulates tumorigenesis. Diseases or disorders associated with ANGPTL4 that can be treated include dyslipidemias, low plasma triglyceride levels, and severe diabetic retinopathy, including both proliferative diabetic retinopathy and non-proliferative diabetic retinopathy.

In embodiments, editing can be used for the treatment of fatty acid disorders. In certain embodiments, the target is one or more of ACADM, HADHA, and ACADVL. In embodiments, the edit targets the activity of a gene in a cell, the gene being selected from the acyl-coenzyme A dehydrogenase for medium chain fatty acids (ACADM) gene, the long-chain 3-hydroxyacyl-coenzyme A dehydrogenase for long chain fatty acids (HADHA) gene, and the acyl-coenzyme A dehydrogenase for very long-chain fatty acids (ACADVL) gene. In one aspect, the disease is medium chain acyl-coenzyme A dehydrogenase deficiency (MCADD), long-chain 3-hydroxyacyl-coenzyme A dehydrogenase deficiency (LCHADD), and/or very long-chain acyl-coenzyme A dehydrogenase deficiency (VLCADD).

Immune Orthogonal Orthologs

In some embodiments, when CRISPR enzymes need to be expressed or administered in a subject, immunogenicity of CRISPR enzymes may be reduced by sequentially expressing or administering immune orthogonal orthologs of the CRISPR enzymes to the subject. As used herein, the term “immune orthogonal orthologs” refers to orthologous proteins that have similar or substantially the same function or activity, but have no or low cross-reactivity with the immune response generated by one another. In some embodiments, sequential expression or administration of such orthologs elicits low or no secondary immune response. The immune orthogonal orthologs can avoid being neutralized by antibodies (e.g., existing antibodies in the host before the orthologs are expressed or administered). Cells expressing the orthologs can avoid being cleared by the host's immune system (e.g., by activated CTLs). In some examples, CRISPR enzyme orthologs from different species may be immune orthogonal orthologs.

Immune orthogonal orthologs may be identified by analyzing the sequences, structures, and/or immunogenicity of a set of candidate orthologs. In an example method, a set of immune orthogonal orthologs may be identified by a) comparing the sequences of a set of candidate orthologs (e.g., orthologs from different species) to identify a subset of candidates that have low or no sequence similarity; and b) assessing immune overlap among the members of the subset of candidates to identify candidates that have no or low immune overlap. In some cases, immune overlap among candidates may be assessed by determining the binding (e.g., affinity) between a candidate ortholog and MHC (e.g., MHC type I and/or MHC type II) of the host. Alternatively or additionally, immune overlap among candidates may be assessed by determining B-cell epitopes for the candidate orthologs. In one example, immune orthogonal orthologs may be identified using the method described in Moreno A M et al., BioRxiv, published online Jan. 10, 2018, doi: doi.org/10.1101/245985.
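By way of a simplified, non-limiting sketch, the immune-overlap assessment of step b) above could be approximated computationally by counting short peptides shared between two candidate orthologs. The sketch below is an assumption for illustration only: it uses shared 9-mer peptides as a crude proxy for immune overlap, the `is_presented` callable is a hypothetical placeholder for an MHC-binding predictor (by default every peptide is treated as presented), and the candidate names and sequences are invented toy data. It is not the method of Moreno et al.

```python
from itertools import combinations

def kmers(protein, k=9):
    """Set of all length-k peptides contained in a protein sequence."""
    return {protein[i:i + k] for i in range(len(protein) - k + 1)}

def immune_overlap(seq_a, seq_b, k=9, is_presented=lambda pep: True):
    """Count k-mer peptides shared by two proteins that pass `is_presented`
    (a hypothetical stand-in for an MHC-binding prediction)."""
    shared = kmers(seq_a, k) & kmers(seq_b, k)
    return sum(1 for pep in shared if is_presented(pep))

def orthogonal_pairs(candidates, k=9, max_shared=0, is_presented=lambda pep: True):
    """Return name pairs whose shared presented peptides do not exceed max_shared."""
    return [
        (name_a, name_b)
        for (name_a, seq_a), (name_b, seq_b) in combinations(sorted(candidates.items()), 2)
        if immune_overlap(seq_a, seq_b, k, is_presented) <= max_shared
    ]

# Toy candidate sequences (invented for illustration only).
candidates = {
    "orthoA": "MKTAYIAKQRQISFVKSHFSRQ",
    "orthoB": "MKTAYIAKQRQGGGGGGGGGGG",
    "orthoC": "WWPLNDCEHHTTVVPLNDCEWW",
}
print(orthogonal_pairs(candidates))
```

In this toy data, orthoA and orthoB share an 11-residue prefix and therefore three 9-mers, so only the pairs involving orthoC are reported. In practice, the placeholder filter would be replaced by a binding prediction for the MHC alleles of the host, and B-cell epitope overlap could be assessed separately.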

Patient-Specific Screening Methods

A nucleic acid-targeting system that targets RNA can be used to screen patients or patient samples for the presence of particular RNA.

Transcript Detection Methods

The effector proteins and systems of the invention are useful for specific detection of RNAs in a cell or other sample. In the presence of an RNA target of interest, guide-dependent CRISPR-Cas nuclease activity may be accompanied by non-specific RNase activity against collateral targets. To take advantage of the RNase activity, all that is needed is a reporter substrate that can be detectably cleaved. For example, a reporter molecule can comprise RNA, tagged with a fluorescent reporter molecule (fluor) on one end and a quencher on the other. In the absence of CRISPR-Cas RNase activity, the physical proximity of the quencher dampens fluorescence from the fluor to low levels. When CRISPR-Cas target specific cleavage is activated by the presence of an RNA target-of-interest and suitable guide RNA, the RNA-containing reporter molecule is non-specifically cleaved and the fluor and quencher are spatially separated. This causes the fluor to emit a detectable signal when excited by light of the appropriate wavelength. In one exemplary assay method, CRISPR-Cas effector, target-of-interest-specific guide RNA, and reporter molecule are added to a cellular sample. An increase in fluorescence indicates the presence of the RNA target-of-interest. In another exemplary method, a detection array is provided. Each location of the array is provided with CRISPR-Cas effector, reporter molecule, and a target-of-interest-specific guide RNA. Depending on the assay to be performed, the target-of-interest-specific guide RNAs at each location of the array can be the same, different, or a combination thereof. Different target-of-interest-specific guide RNAs might be provided, for example, when it is desired to test for one or more targets in a single source sample. The same target-of-interest-specific guide RNA might be provided at each location, for example, when it is desired to test multiple samples for the same target.
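As a minimal, non-limiting sketch of how readings from such a detection array might be interpreted, the following example flags array locations whose reporter fluorescence has increased relative to a no-target control. The function name `call_targets`, the fold-change threshold of 2.0, and the fluorescence values are hypothetical assumptions chosen for illustration; suitable controls and thresholds would be determined empirically for a given assay.

```python
def call_targets(signal, no_target_control, min_fold=2.0):
    """Map each array location to True when its fluorescence is at least
    `min_fold` times the no-target control reading."""
    if no_target_control <= 0:
        raise ValueError("control fluorescence must be positive")
    return {
        location: (value / no_target_control) >= min_fold
        for location, value in signal.items()
    }

# Hypothetical fluorescence readings (arbitrary units) for three array locations.
readings = {"A1": 5200.0, "A2": 480.0, "A3": 1500.0}
calls = call_targets(readings, no_target_control=500.0, min_fold=2.0)
print(calls)
```

Here a location is scored positive only when cleavage of the quenched reporter has raised the signal above the chosen multiple of background, consistent with an increase in fluorescence indicating the presence of the RNA target-of-interest.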

In certain embodiments, CRISPR-Cas is provided or expressed in an in vitro system or in a cell, transiently or stably, and targeted or triggered to non-specifically cleave cellular nucleic acids. In one embodiment, CRISPR-Cas is engineered to knock down ssDNA, for example viral ssDNA. In another embodiment, CRISPR-Cas is engineered to knock down RNA. The system can be devised such that the knockdown is dependent on a target DNA present in the cell or in vitro system, or triggered by the addition of a target nucleic acid to the system or cell.

In an embodiment, the CRISPR-Cas system is engineered to non-specifically cleave RNA in a subset of cells distinguishable by the presence of an aberrant DNA sequence, for instance where cleavage of the aberrant DNA might be incomplete or ineffectual. In one non-limiting example, a DNA translocation that is present in a cancer cell and drives cell transformation is targeted. Whereas a subpopulation of cells that undergoes chromosomal DNA cleavage and repair may survive, non-specific collateral ribonuclease activity advantageously leads to cell death of potential survivors.

Additional Aspects of Application

The invention has a broad spectrum of applications in, e.g., gene therapy, drug screening, disease diagnosis, and prognosis.

The terms “polynucleotide”, “nucleotide”, “nucleotide sequence”, “nucleic acid” and “oligonucleotide” are used interchangeably. They refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three dimensional structure, and may perform any function, known or unknown. The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, short interfering RNA (siRNA), short-hairpin RNA (shRNA), micro-RNA (miRNA), ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. The term also encompasses nucleic-acid-like structures with synthetic backbones, see, e.g., Eckstein, 1991; Baserga et al., 1992; Milligan, 1993; WO 97/03211; WO 96/39154; Mata, 1997; Strauss-Soukup, 1997; and Samstag, 1996. A polynucleotide may comprise one or more modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component.

“Complementarity” refers to the ability of a nucleic acid to form hydrogen bond(s) with another nucleic acid sequence by either traditional Watson-Crick base pairing or other non-traditional types. A percent complementarity indicates the percentage of residues in a nucleic acid molecule which can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, 10 out of 10 being 50%, 60%, 70%, 80%, 90%, and 100% complementary, respectively). “Perfectly complementary” means that all the contiguous residues of a nucleic acid sequence will hydrogen bond with the same number of contiguous residues in a second nucleic acid sequence. “Substantially complementary” as used herein refers to a degree of complementarity that is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, or more nucleotides, or refers to two nucleic acids that hybridize under stringent conditions.

As used herein, “stringent conditions” for hybridization refer to conditions under which a nucleic acid having complementarity to a target sequence predominantly hybridizes with the target sequence, and substantially does not hybridize to non-target sequences. Stringent conditions are generally sequence-dependent, and vary depending on a number of factors. In general, the longer the sequence, the higher the temperature at which the sequence specifically hybridizes to its target sequence. Non-limiting examples of stringent conditions are described in detail in Tijssen (1993), Laboratory Techniques In Biochemistry And Molecular Biology-Hybridization With Nucleic Acid Probes Part I, Second Chapter “Overview of principles of hybridization and the strategy of nucleic acid probe assay”, Elsevier, N.Y. Where reference is made to a polynucleotide sequence, then complementary or partially complementary sequences are also envisaged. These are preferably capable of hybridizing to the reference sequence under highly stringent conditions. Generally, in order to maximize the hybridization rate, relatively low-stringency hybridization conditions are selected: about 20 to 25° C. lower than the thermal melting point (T_(m)). The T_(m) is the temperature at which 50% of specific target sequence hybridizes to a perfectly complementary probe in solution at a defined ionic strength and pH. Generally, in order to require at least about 85% nucleotide complementarity of hybridized sequences, highly stringent washing conditions are selected to be about 5 to 15° C. lower than the T_(m). In order to require at least about 70% nucleotide complementarity of hybridized sequences, moderately-stringent washing conditions are selected to be about 15 to 30° C. lower than the T_(m). Highly permissive (very low stringency) washing conditions may be as low as 50° C. below the T_(m), allowing a high level of mis-matching between hybridized sequences. Those skilled in the art will recognize that other physical and chemical parameters in the hybridization and wash stages can also be altered to affect the outcome of a detectable hybridization signal from a specific level of homology between target and probe sequences. Preferred highly stringent conditions comprise incubation in 50% formamide, 5×SSC, and 1% SDS at 42° C., or incubation in 5×SSC and 1% SDS at 65° C., with wash in 0.2×SSC and 0.1% SDS at 65° C.

“Hybridization” refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence specific manner. The complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction may constitute a step in a more extensive process, such as the initiation of PCR, or the cleavage of a polynucleotide by an enzyme. A sequence capable of hybridizing with a given sequence is referred to as the “complement” of the given sequence.

As used herein, the term “genomic locus” or “locus” (plural loci) is the specific location of a gene or DNA sequence on a chromosome. A “gene” refers to stretches of DNA or RNA that encode a polypeptide or an RNA chain that has a functional role to play in an organism and hence is the molecular unit of heredity in living organisms. For the purpose of this invention it may be considered that genes include regions which regulate the production of the gene product, whether or not such regulatory sequences are adjacent to coding and/or transcribed sequences. Accordingly, a gene includes, but is not necessarily limited to, promoter sequences, terminators, translational regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites and locus control regions. As used herein, “expression of a genomic locus” or “gene expression” is the process by which information from a gene is used in the synthesis of a functional gene product. The products of gene expression are often proteins, but in non-protein coding genes such as rRNA genes or tRNA genes, the product is functional RNA. The process of gene expression is used by all known life, including eukaryotes (including multicellular organisms), prokaryotes (bacteria and archaea) and viruses, to generate functional products to survive. As used herein “expression” of a gene or nucleic acid encompasses not only cellular gene expression, but also the transcription and translation of nucleic acid(s) in cloning systems and in any other context. As used herein, “expression” also refers to the process by which a polynucleotide is transcribed from a DNA template (such as into an mRNA or other RNA transcript) and/or the process by which a transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins. Transcripts and encoded polypeptides may be collectively referred to as “gene product.” If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell.

The terms “polypeptide”, “peptide” and “protein” are used interchangeably herein to refer to polymers of amino acids of any length. The polymer may be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids. The terms also encompass an amino acid polymer that has been modified; for example, by disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation, such as conjugation with a labeling component. As used herein the term “amino acid” includes natural and/or unnatural or synthetic amino acids, including glycine and both the D or L optical isomers, and amino acid analogs and peptidomimetics. As used herein, the term “domain” or “protein domain” refers to a part of a protein sequence that may exist and function independently of the rest of the protein chain.

As described in aspects of the invention, sequence identity is related to sequence homology. Homology comparisons may be conducted by eye, or more usually, with the aid of readily available sequence comparison programs. These commercially available computer programs may calculate percent (%) homology between two or more sequences and may also calculate the sequence identity shared by two or more amino acid or nucleic acid sequences.
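For illustration only, the percent complementarity defined above (e.g., 8 out of 10 residues being 80% complementary) can be sketched as a short calculation. The sketch below makes simplifying assumptions not stated in this disclosure: the two strands are of equal length and paired antiparallel without gaps, and only Watson-Crick pairs (A-U, A-T, G-C) are counted, so non-traditional pairing such as G-U wobble is ignored. The function name and the example sequences are hypothetical.

```python
# Watson-Crick pairs only (an assumption; wobble and other pairings are ignored).
WATSON_CRICK = {
    ("A", "U"), ("U", "A"), ("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"),
}

def percent_complementarity(strand, other):
    """Percentage of residues in `strand` (5'->3') that can Watson-Crick pair
    with `other` (5'->3') when the two are aligned antiparallel, ungapped."""
    if len(strand) != len(other) or not strand:
        raise ValueError("strands must be non-empty and of equal length")
    partner = other.upper()[::-1]  # reverse so position i faces position i
    paired = sum((a, b) in WATSON_CRICK for a, b in zip(strand.upper(), partner))
    return 100.0 * paired / len(strand)

guide = "AUGGCUAGCU"               # hypothetical 10-nt guide segment
perfect_target = "AGCUAGCCAU"      # reverse complement of the guide
mismatched_target = "CGCUAGCCAG"   # same target with both terminal bases changed

print(percent_complementarity(guide, perfect_target))
print(percent_complementarity(guide, mismatched_target))
```

With these toy sequences, the first comparison pairs at all 10 positions and the second at 8 of 10 positions, corresponding to 100% and 80% complementarity under the stated assumptions.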

As used herein the term “wild type” is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene or characteristic as it occurs in nature as distinguished from mutant or variant forms. A “wild type” can be a base line.

As used herein the term “variant” should be taken to mean the exhibition of qualities that have a pattern that deviates from what occurs in nature. The terms “non-naturally occurring” or “engineered” are used interchangeably and indicate the involvement of the hand of man. The terms, when referring to nucleic acid molecules or polypeptides, mean that the nucleic acid molecule or the polypeptide is at least substantially free from at least one other component with which they are naturally associated in nature and as found in nature. In all aspects and embodiments, whether they include these terms or not, it will be understood that, preferably, the terms may be optional and thus may be included or not included. Furthermore, the terms “non-naturally occurring” and “engineered” may be used interchangeably and so can therefore be used alone or in combination and one or other may replace mention of both together. In particular, “engineered” is preferred in place of “non-naturally occurring” or “non-naturally occurring and/or engineered.”

Sequence homologies may be generated by any of a number of computerprograms known in the art, for example BLAST or FASTA, etc. A suitablecomputer program for carrying out such an alignment is the GCG WisconsinBestfit package (University of Wisconsin, U.S.A; Devereux et al., 1984,Nucleic Acids Research 12:387). Examples of other software than mayperform sequence comparisons include, but are not limited to, the BLASTpackage (see Ausubel et al., 1999 ibid—Chapter 18), FASTA (Atschul etal., 1990, J. Mol. Biol., 403-410) and the GENEWORKS suite of comparisontools. Both BLAST and FASTA are available for offline and onlinesearching (see Ausubel et al., 1999 ibid, pages 7-58 to 7-60). Howeverit is preferred to use the GCG Bestfit program. Percentage (%) sequencehomology may be calculated over contiguous sequences, i.e., one sequenceis aligned with the other sequence and each amino acid or nucleotide inone sequence is directly compared with the corresponding amino acid ornucleotide in the other sequence, one residue at a time. This is calledan “ungapped” alignment. Typically, such ungapped alignments areperformed only over a relatively short number of residues. Although thisis a very simple and consistent method, it fails to take intoconsideration that, for example, in an otherwise identical pair ofsequences, one insertion or deletion may cause the following amino acidresidues to be put out of alignment, thus potentially resulting in alarge reduction in % homology when a global alignment is performed.Consequently, most sequence comparison methods are designed to produceoptimal alignments that take into consideration possible insertions anddeletions without unduly penalizing the overall homology or identityscore. This is achieved by inserting “gaps” in the sequence alignment totry to maximize local homology or identity. 
However, these more complexmethods assign “gap penalties” to each gap that occurs in the alignmentso that, for the same number of identical amino acids, a sequencealignment with as few gaps as possible—reflecting higher relatednessbetween the two compared sequences—may achieve a higher score than onewith many gaps. “Affinity gap costs” are typically used that charge arelatively high cost for the existence of a gap and a smaller penaltyfor each subsequent residue in the gap. This is the most commonly usedgap scoring system. High gap penalties may, of course, produce optimizedalignments with fewer gaps. Most alignment programs allow the gappenalties to be modified. However, it is preferred to use the defaultvalues when using such software for sequence comparisons. For example,when using the GCG Wisconsin Bestfit package the default gap penalty foramino acid sequences is −12 for a gap and −4 for each extension.Calculation of maximum % homology therefore first requires theproduction of an optimal alignment, taking into consideration gappenalties. A suitable computer program for carrying out such analignment is the GCG Wisconsin Bestfit package (Devereux et al., 1984Nuc. Acids Research 12 p 387). Examples of other software than mayperform sequence comparisons include, but are not limited to, the BLASTpackage (see Ausubel et al., 1999 Short Protocols in Molecular Biology,4th Ed. —Chapter 18), FASTA (Altschul et al., 1990 J. Mol. Biol.403-410) and the GENEWORKS suite of comparison tools. Both BLAST andFASTA are available for offline and online searching (see Ausubel etal., 1999, Short Protocols in Molecular Biology, pages 7-58 to 7-60).However, for some applications, it is preferred to use the GCG Bestfitprogram. A new tool, called BLAST 2 Sequences is also available forcomparing protein and nucleotide sequences (see FEMS Microbiol Lett.1999 174(2): 247-50; FEMS Microbiol Lett. 
1999 177(1): 187-8 and thewebsite of the National Center for Biotechnology information at thewebsite of the National Institutes for Health). Although the final %homology may be measured in terms of identity, the alignment processitself is typically not based on an all-or-nothing pair comparison.Instead, a scaled similarity score matrix is generally used that assignsscores to each pair-wise comparison based on chemical similarity orevolutionary distance. An example of such a matrix commonly used is theBLOSUM62 matrix—the default matrix for the BLAST suite of programs. GCGWisconsin programs generally use either the public default values or acustom symbol comparison table, if supplied (see user manual for furtherdetails). For some applications, it is preferred to use the publicdefault values for the GCG package, or in the case of other software,the default matrix, such as BLOSUM62. Alternatively, percentagehomologies may be calculated using the multiple alignment feature inDNASIS™ (Hitachi Software), based on an algorithm, analogous to CLUSTAL(Higgins D G & Sharp P M (1988), Gene 73(1), 237-244). Once the softwarehas produced an optimal alignment, it is possible to calculate %homology, preferably % sequence identity. The software typically doesthis as part of the sequence comparison and generates a numericalresult. The sequences may also have deletions, insertions orsubstitutions of amino acid residues which produce a silent change andresult in a functionally equivalent substance. Deliberate amino acidsubstitutions may be made on the basis of similarity in amino acidproperties (such as polarity, charge, solubility, hydrophobicity,hydrophilicity, and/or the amphipathic nature of the residues) and it istherefore useful to group amino acids together in functional groups.Amino acids may be grouped together based on the properties of theirside chains alone. However, it is more useful to include mutation dataas well. 
The sets of amino acids thus derived are likely to be conserved for structural reasons. These sets may be described in the form of a Venn diagram (Livingstone C.D. and Barton G. J. (1993) “Protein sequence alignments: a strategy for the hierarchical analysis of residue conservation” Comput. Appl. Biosci. 9: 745-756) (Taylor W.R. (1986) “The classification of amino acid conservation” J. Theor. Biol. 119; 205-218). Conservative substitutions may be made, for example according to Table 7 which describes a generally accepted Venn diagram grouping of amino acids.
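As an illustration of the affine gap scoring and percent identity calculation discussed above, the following is a minimal Python sketch and not the GCG Bestfit implementation. The function names are hypothetical. The default penalties of −12 for a gap and −4 for each extension are taken from the text; it is assumed here that the gap penalty covers the first gapped residue, that the extension penalty applies to each subsequent residue, and that percent identity is computed over columns in which both sequences have a residue. Other programs may use other conventions.

```python
def affine_gap_penalty(gap_length, gap_open=-12, gap_extend=-4):
    """Penalty for one gap of the given length.

    Assumption: gap_open is charged once for the existence of the gap
    (covering its first residue) and gap_extend for each further residue.
    """
    if gap_length <= 0:
        return 0
    return gap_open + gap_extend * (gap_length - 1)


def gap_runs(aligned):
    """Return the lengths of consecutive '-' runs in one aligned sequence."""
    runs, current = [], 0
    for ch in aligned:
        if ch == "-":
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def total_gap_penalty(aligned_a, aligned_b, gap_open=-12, gap_extend=-4):
    """Sum the affine penalties of all gaps in a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    runs = gap_runs(aligned_a) + gap_runs(aligned_b)
    return sum(affine_gap_penalty(n, gap_open, gap_extend) for n in runs)


def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)
```

Under these assumptions, a single two-residue gap (for example `ACDEFGHIK` aligned to `ACD--GHIK`) is charged −16, whereas two separate one-residue gaps (`AC-E-GHIK`) are charged −24, so the alignment with fewer gaps scores higher for the same number of identical residues.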

The terms “subject,” “individual,” and “patient” are used interchangeably herein to refer to a vertebrate, preferably a mammal, more preferably a human. Mammals include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells and their progeny of a biological entity obtained in vivo or cultured in vitro are also encompassed.

The terms “therapeutic agent”, “therapeutic capable agent” or “treatmentagent” are used interchangeably and refer to a molecule or compound thatconfers some beneficial effect upon administration to a subject. Thebeneficial effect includes enablement of diagnostic determinations;amelioration of a disease, symptom, disorder, or pathological condition;reducing or preventing the onset of a disease, symptom, disorder orcondition; and generally counteracting a disease, symptom, disorder orpathological condition. As used herein, “treatment” or “treating,” or“palliating” or “ameliorating” are used interchangeably. These termsrefer to an approach for obtaining beneficial or desired resultsincluding but not limited to a therapeutic benefit and/or a prophylacticbenefit. By therapeutic benefit is meant any therapeutically relevantimprovement in or effect on one or more diseases, conditions, orsymptoms under treatment. For prophylactic benefit, the compositions maybe administered to a subject at risk of developing a particular disease,condition, or symptom, or to a subject reporting one or more of thephysiological symptoms of a disease, even though the disease, condition,or symptom may not have yet been manifested. The term “effective amount”or “therapeutically effective amount” refers to the amount of an agentthat is sufficient to effect beneficial or desired results. Thetherapeutically effective amount may vary depending upon one or more of:the subject and disease condition being treated, the weight and age ofthe subject, the severity of the disease condition, the manner ofadministration and the like, which can readily be determined by one ofordinary skill in the art. The term also applies to a dose that willprovide an image for detection by any one of the imaging methodsdescribed herein. 
The specific dose may vary depending on one or more of: the particular agent chosen, the dosing regimen to be followed, whether it is administered in combination with other compounds, timing of administration, the tissue to be imaged, and the physical delivery system in which it is carried.

The practice of the present invention employs, unless otherwiseindicated, conventional techniques of immunology, biochemistry,chemistry, molecular biology, microbiology, cell biology, genomics andrecombinant DNA, which are within the skill of the art. See Sambrook,Fritsch and Maniatis, MOLECULAR CLONING: A LABORATORY MANUAL, 2ndedition (1989); CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (F. M. Ausubel,et al. eds., (1987)); the series METHODS IN ENZYMOLOGY (Academic Press,Inc.): PCR 2: A PRACTICAL APPROACH (M. J. MacPherson, B. D. Hames and G.R. Taylor eds. (1995)), Harlow and Lane, eds. (1988) ANTIBODIES, ALABORATORY MANUAL, and ANIMAL CELL CULTURE (R.I. Freshney, ed. (1987)).Several aspects of the invention relate to vector systems comprising oneor more vectors, or vectors as such. Vectors can be designed forexpression of CRISPR transcripts (e.g. nucleic acid transcripts,proteins, or enzymes) in prokaryotic or eukaryotic cells. For example,CRISPR transcripts can be expressed in bacterial cells such asEscherichia coli, insect cells (using baculovirus expression vectors),yeast cells, or mammalian cells. Suitable host cells are discussedfurther in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY185, Academic Press, San Diego, Calif. (1990). Alternatively, therecombinant expression vector can be transcribed and translated invitro, for example using T7 promoter regulatory sequences and T7polymerase. Embodiments of the invention include sequences (bothpolynucleotide or polypeptide) which may comprise homologoussubstitution (substitution and replacement are both used herein to meanthe interchange of an existing amino acid residue or nucleotide, with analternative residue or nucleotide) that may occur i.e., like-for-likesubstitution in the case of amino acids such as basic for basic, acidicfor acidic, polar for polar, etc. 
Non-homologous substitution may also occur, i.e., from one class of residue to another or alternatively involving the inclusion of unnatural amino acids such as ornithine (hereinafter referred to as Z), diaminobutyric acid ornithine (hereinafter referred to as B), norleucine ornithine (hereinafter referred to as O), pyridylalanine, thienylalanine, naphthylalanine and phenylglycine. Variant amino acid sequences may include suitable spacer groups that may be inserted between any two amino acid residues of the sequence including alkyl groups such as methyl, ethyl or propyl groups in addition to amino acid spacers such as glycine or β-alanine residues. A further form of variation, which involves the presence of one or more amino acid residues in peptoid form, may be well understood by those skilled in the art. For the avoidance of doubt, “the peptoid form” is used to refer to variant amino acid residues wherein the α-carbon substituent group is on the residue's nitrogen atom rather than the α-carbon. Processes for preparing peptides in the peptoid form are known in the art, for example Simon R J et al., PNAS (1992) 89(20), 9367-9371 and Horwell D C, Trends Biotechnol. (1995) 13(4), 132-134. Homology modelling: Corresponding residues in other CRISPR-Cas orthologs can be identified by the methods of Zhang et al., 2012 (Nature; 490(7421): 556-60) and Chen et al., 2015 (PLoS Comput Biol; 11(5): e1004248), a computational protein-protein interaction (PPI) method to predict interactions mediated by domain-motif interfaces. PrePPI (Predicting PPI), a structure based PPI prediction method, combines structural evidence with non-structural evidence using a Bayesian statistical framework. The method involves taking a pair of query proteins and using structural alignment to identify structural representatives that correspond to either their experimentally determined structures or homology models. 
Structural alignment is further used to identify both close and remote structural neighbors by considering global and local geometric relationships. Whenever two neighbors of the structural representatives form a complex reported in the Protein Data Bank, this defines a template for modelling the interaction between the two query proteins. Models of the complex are created by superimposing the representative structures on their corresponding structural neighbor in the template. This approach is further described in Dey et al., 2013 (Prot Sci; 22: 359-66).

For purposes of this invention, amplification means any method employing a primer and a polymerase capable of replicating a target sequence with reasonable fidelity. Amplification may be carried out by natural or recombinant DNA polymerases such as TaqGold™, T7 DNA polymerase, Klenow fragment of E. coli DNA polymerase, and reverse transcriptase. A preferred amplification method is PCR. In certain aspects the invention involves vectors. As used herein, a “vector” is a tool that allows or facilitates the transfer of an entity from one environment to another. It is a replicon, such as a plasmid, phage, or cosmid, into which another DNA segment may be inserted so as to bring about the replication of the inserted segment. Generally, a vector is capable of replication when associated with the proper control elements. In general, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. One type of vector is a “plasmid,” which refers to a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques. Another type of vector is a viral vector, wherein virally-derived DNA or RNA sequences are present in the vector for packaging into a virus (e.g., retroviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, and adeno-associated viruses (AAVs)). 
Viral vectors also includepolynucleotides carried by a virus for transfection into a host cell.Certain vectors are capable of autonomous replication in a host cellinto which they are introduced (e.g., bacterial vectors having abacterial origin of replication and episomal mammalian vectors). Othervectors (e.g., non-episomal mammalian vectors) are integrated into thegenome of a host cell upon introduction into the host cell, and therebyare replicated along with the host genome. Moreover, certain vectors arecapable of directing the expression of genes to which they areoperatively-linked. Such vectors are referred to herein as “expressionvectors.” Common expression vectors of utility in recombinant DNAtechniques are often in the form of plasmids. Recombinant expressionvectors can comprise a nucleic acid of the invention in a form suitablefor expression of the nucleic acid in a host cell, which means that therecombinant expression vectors include one or more regulatory elements,which may be selected on the basis of the host cells to be used forexpression, that is operatively-linked to the nucleic acid sequence tobe expressed. Within a recombinant expression vector, “operably linked”is intended to mean that the nucleotide sequence of interest is linkedto the regulatory element(s) in a manner that allows for expression ofthe nucleotide sequence (e.g., in an in vitro transcription/translationsystem or in a host cell when the vector is introduced into the hostcell). With regards to recombination and cloning methods, mention ismade of U.S. patent application Ser. No. 10/815,730, published Sep. 2,2004 as US 2004-0171156 A1, the contents of which are hereinincorporated by reference in their entirety. Aspects of the inventionrelate to bicistronic vectors for guide RNA and wild type, modified ormutated CRISPR effector proteins/enzymes (e.g. Cas13 effector proteins).Bicistronic expression vectors guide RNA and wild type, modified ormutated CRISPR effector proteins/enzymes (e.g. 
Cas13 effector proteins) are preferred. In general, and particularly in this embodiment, the wild type, modified or mutated CRISPR effector protein/enzyme (e.g. a Cas13 effector protein) is preferably driven by the CBh promoter. The RNA may preferably be driven by a Pol III promoter, such as a U6 promoter. Ideally the two are combined.

In some embodiments, a loop in the guide RNA or crRNA is provided. This may be a stem loop or a tetra loop. The loop is preferably GAAA, but it is not limited to this sequence or indeed to being only 4 bp in length. Indeed, preferred loop forming sequences for use in hairpin structures are four nucleotides in length, and most preferably have the sequence GAAA. However, longer or shorter loop sequences may be used, as may alternative sequences. The sequences preferably include a nucleotide triplet (for example, AAA), and an additional nucleotide (for example C or G). Examples of loop forming sequences include CAAA and AAAG.
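The composition of such four-nucleotide loop forming sequences, a nucleotide triplet plus one additional nucleotide, can be sketched as follows. This is only an illustrative Python enumeration under stated assumptions: the function name is hypothetical, and it is assumed that the additional nucleotide may be placed on either side of the triplet. It is not a statement of which loops are functional.

```python
def candidate_loops(triplet="AAA", extras=("C", "G")):
    """Enumerate four-nucleotide loop candidates.

    Assumption: the additional nucleotide is placed either 5' or 3'
    of the triplet; both placements are listed.
    """
    loops = set()
    for nt in extras:
        loops.add(nt + triplet)   # additional nucleotide on the 5' side
        loops.add(triplet + nt)   # additional nucleotide on the 3' side
    return sorted(loops)
```

With the default arguments the enumeration contains GAAA, CAAA and AAAG, the loop forming sequences named in the text, together with AAAC.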

In practicing any of the methods disclosed herein, a suitable vector can be introduced to a cell or an embryo via one or more methods known in the art, including without limitation, microinjection, electroporation, sonoporation, biolistics, calcium phosphate-mediated transfection, cationic transfection, liposome transfection, dendrimer transfection, heat shock transfection, nucleofection transfection, magnetofection, lipofection, impalefection, optical transfection, proprietary agent-enhanced uptake of nucleic acids, and delivery via liposomes, immunoliposomes, virosomes, or artificial virions. In some methods, the vector is introduced into an embryo by microinjection. The vector or vectors may be microinjected into the nucleus or the cytoplasm of the embryo. In some methods, the vector or vectors may be introduced into a cell by nucleofection.

Vectors can be designed for expression of CRISPR transcripts (e.g., nucleic acid transcripts, proteins, or enzymes) in prokaryotic or eukaryotic cells. For example, CRISPR transcripts can be expressed in bacterial cells such as Escherichia coli, insect cells (using baculovirus expression vectors), yeast cells, or mammalian cells. Suitable host cells are discussed further in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990). Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

Vectors may be introduced and propagated in a prokaryote or prokaryoticcell. In some embodiments, a prokaryote is used to amplify copies of avector to be introduced into a eukaryotic cell or as an intermediatevector in the production of a vector to be introduced into a eukaryoticcell (e.g., amplifying a plasmid as part of a viral vector packagingsystem). In some embodiments, a prokaryote is used to amplify copies ofa vector and express one or more nucleic acids, such as to provide asource of one or more proteins for delivery to a host cell or hostorganism. Expression of proteins in prokaryotes is most often carriedout in Escherichia coli with vectors containing constitutive orinducible promoters directing the expression of either fusion ornon-fusion proteins. Fusion vectors add a number of amino acids to aprotein encoded therein, such as to the amino terminus of therecombinant protein. Such fusion vectors may serve one or more purposes,such as: (i) to increase expression of recombinant protein; (ii) toincrease the solubility of the recombinant protein; and (iii) to aid inthe purification of the recombinant protein by acting as a ligand inaffinity purification. Often, in fusion expression vectors, aproteolytic cleavage site is introduced at the junction of the fusionmoiety and the recombinant protein to enable separation of therecombinant protein from the fusion moiety subsequent to purification ofthe fusion protein. Such enzymes, and their cognate recognitionsequences, include Factor Xa, thrombin and enterokinase. Example fusionexpression vectors include pGEX (Pharmacia Biotech Inc; Smith andJohnson, 1988. Gene 67: 31-40), pMAL (New England Biolabs, Beverly,Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) that fuse glutathioneS-transferase (GST), maltose E binding protein, or protein A,respectively, to the target recombinant protein. Examples of suitableinducible non-fusion E. 
coli expression vectors include pTrc (Amrann etal., (1988) Gene 69:301-315) and pET 11d (Studier et al., GENEEXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, SanDiego, Calif. (1990) 60-89). In some embodiments, a vector is a yeastexpression vector. Examples of vectors for expression in yeastSaccharomyces cerevisiae include pYepSec1 (Baldari, et al., 1987. EMBOJ. 6: 229-234), pMFa (Kuijan and Herskowitz, 1982. Cell 30: 933-943),pJRY88 (Schultz et al., 1987. Gene 54: 113-123), pYES2 (InvitrogenCorporation, San Diego, Calif.), and picZ (InVitrogen Corp, San Diego,Calif). In some embodiments, a vector drives protein expression ininsect cells using baculovirus expression vectors. Baculovirus vectorsavailable for expression of proteins in cultured insect cells (e.g., SF9cells) include the pAc series (Smith, et al., 1983. Mol. Cell. Biol. 3:2156-2165) and the pVL series (Lucklow and Summers, 1989. Virology 170:31-39). In some embodiments, a vector is capable of driving expressionof one or more sequences in mammalian cells using a mammalian expressionvector. Examples of mammalian expression vectors include pCDM8 (Seed,1987. Nature 329: 840) and pMT2PC (Kaufman, et al., 1987. EMBO J. 6:187-195). When used in mammalian cells, the expression vector's controlfunctions are typically provided by one or more regulatory elements. Forexample, commonly used promoters are derived from polyoma, adenovirus 2,cytomegalovirus, simian virus 40, and others disclosed herein and knownin the art. For other suitable expression systems for both prokaryoticand eukaryotic cells see, e.g., Chapters 16 and 17 of Sambrook, et al.,MOLECULAR CLONING: A LABORATORY MANUAL. 2nd ed., Cold Spring HarborLaboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor,N.Y., 1989. 
In some embodiments, the recombinant mammalian expressionvector is capable of directing expression of the nucleic acidpreferentially in a particular cell type (e.g., tissue-specificregulatory elements are used to express the nucleic acid).Tissue-specific regulatory elements are known in the art. Non-limitingexamples of suitable tissue-specific promoters include the albuminpromoter (liver-specific; Pinkert, et al., 1987. Genes Dev. 1: 268-277),lymphoid-specific promoters (Calame and Eaton, 1988. Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto andBaltimore, 1989. EMBO J. 8: 729-733) and immunoglobulins (Baneiji, etal., 1983. Cell 33: 729-740; Queen and Baltimore, 1983. Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter;Byrne and Ruddle, 1989. Proc. Natl. Acad. Sci. USA 86: 5473-5477),pancreas-specific promoters (Edlund, et al., 1985. Science 230:912-916), and mammary gland-specific promoters (e.g., milk wheypromoter; U.S. Pat. No. 4,873,316 and European Application PublicationNo. 264,166). Developmentally-regulated promoters are also encompassed,e.g., the murine hox promoters (Kessel and Gruss, 1990. Science 249:374-379) and the α-fetoprotein promoter (Campes and Tilghman, 1989.Genes Dev. 3: 537-546). With regards to these prokaryotic and eukaryoticvectors, mention is made of U.S. Pat. No. 6,750,059, the contents ofwhich are incorporated by reference herein in their entirety. Otherembodiments of the invention may relate to the use of viral vectors,with regards to which mention is made of U.S. patent application Ser.No. 13/092,085, the contents of which are incorporated by referenceherein in their entirety. Tissue-specific regulatory elements are knownin the art and in this regard, mention is made of U.S. Pat. No.7,776,321, the contents of which are incorporated by reference herein intheir entirety.

In some embodiments, a regulatory element is operably linked to one ormore elements of or encoding a CRISPR Cas system or complex so as todrive expression of the one or more elements of the CRISPR system. Ingeneral, CRISPRs (Clustered Regularly Interspaced Short PalindromicRepeats), also known as SPIDRs (SPacer Interspersed Direct Repeats),constitute a family of DNA loci that are usually specific to aparticular bacterial species. The CRISPR locus comprises a distinctclass of interspersed short sequence repeats (SSRs) that were recognizedin E. coli (Ishino et al., J. Bacteriol., 169:5429-5433 [1987]; andNakata et al., J. Bacteriol., 171:3553-3556 [1989]), and associatedgenes. Similar interspersed SSRs have been identified in Haloferaxmediterranei, Streptococcus pyogenes, Anabaena, and Mycobacteriumtuberculosis (See, Groenen et al., Mol. Microbiol., 10:1057-1065 [1993];Hoe et al., Emerg. Infect. Dis., 5:254-263 [1999]; Masepohl et al.,Biochim. Biophys. Acta 1307:26-30 [1996]; and Mojica et al., Mol.Microbiol., 17:85-93 [1995]). The CRISPR loci typically differ fromother SSRs by the structure of the repeats, which have been termed shortregularly spaced repeats (SRSRs) (Janssen et al., OMICS J. Integ. Biol.,6:23-33 [2002]; and Mojica et al., Mol. Microbiol., 36:244-246 [2000]).In general, the repeats are short elements that occur in clusters thatare regularly spaced by unique intervening sequences with asubstantially constant length (Mojica et al., [2000], supra). Althoughthe repeat sequences are highly conserved between strains, the number ofinterspersed repeats and the sequences of the spacer regions typicallydiffer from strain to strain (van Embden et al., J. Bacteriol.,182:2393-2401 [2000]). CRISPR loci have been identified in more than 40prokaryotes (See e.g., Jansen et al., Mol. 
Microbiol., 43:1565-1575 [2002]; and Mojica et al., [2005]) including, but not limited to Aeropyrum, Pyrobaculum, Sulfolobus, Archaeoglobus, Haloarcula, Methanobacterium, Methanococcus, Methanosarcina, Methanopyrus, Pyrococcus, Picrophilus, Thermoplasma, Corynebacterium, Mycobacterium, Streptomyces, Aquifex, Porphyromonas, Chlorobium, Thermus, Bacillus, Listeria, Staphylococcus, Clostridium, Thermoanaerobacter, Mycoplasma, Fusobacterium, Azoarcus, Chromobacterium, Neisseria, Nitrosomonas, Desulfovibrio, Geobacter, Myxococcus, Campylobacter, Wolinella, Acinetobacter, Erwinia, Escherichia, Legionella, Methylococcus, Pasteurella, Photobacterium, Salmonella, Xanthomonas, Yersinia, Treponema, and Thermotoga.

In general, “RNA-targeting system” as used in the present applicationrefers collectively to transcripts and other elements involved in theexpression of or directing the activity of RNA-targetingCRISPR-associated 13 (“Cas13”) genes (also referred to herein as aneffector protein), including sequences encoding a RNA-targeting Cas(effector) protein and a guide RNA (or crRNA sequence), with referenceto the mutated CRISPR-Cas as herein discussed. In general, aRNA-targeting system is characterized by elements that promote theformation of a RNA-targeting complex at the site of a target sequence.In the context of formation of a RNA-targeting complex, “targetsequence” refers to a RNA sequence to which a guide sequence (or theguide or of the crRNA) is designed to have complementarity, wherehybridization between a target sequence and a guide RNA promotes theformation of a RNA-targeting complex. Full complementarity is notnecessarily required, provided there is sufficient complementarity tocause hybridization and promote formation of a RNA-targeting complex. Insome embodiments, a target sequence is located in the nucleus orcytoplasm of a cell. In some embodiments, the target sequence may bewithin an organelle of a eukaryotic cell. A sequence or template thatmay be used for recombination into the targeted locus comprising thetarget sequences is referred to as an “editing template” or “editingRNA” or “editing sequence”. In aspects of the invention, an exogenoustemplate RNA may be referred to as an editing template. In an aspect ofthe invention the recombination is homologous recombination. In general,a guide sequence is any polynucleotide sequence having sufficientcomplementarity with a target polynucleotide sequence to hybridize withthe target sequence and direct sequence-specific binding of a nucleicacid-targeting complex to the target sequence. 
In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more. Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g. the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net). In some embodiments, a guide sequence is about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some embodiments, a guide sequence is less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length. The ability of a guide sequence to direct sequence-specific binding of a RNA-targeting complex to a target sequence may be assessed by any suitable assay. A template polynucleotide may be of any suitable length, such as about or more than about 10, 15, 20, 25, 50, 75, 100, 150, 200, 500, 1000, or more nucleotides in length. In some embodiments, the template polynucleotide is complementary to a portion of a polynucleotide comprising the target sequence. When optimally aligned, a template polynucleotide might overlap with one or more nucleotides of a target sequence (e.g. about or more than about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 or more nucleotides). 
In someembodiments, when a template sequence and a polynucleotide comprising atarget sequence are optimally aligned, the nearest nucleotide of thetemplate polynucleotide is within about 1, 5, 10, 15, 20, 25, 50, 75,100, 200, 300, 400, 500, 1000, 5000, 10000, or more nucleotides from thetarget sequence. In some embodiments, the RNA-targeting effector proteinis part of a fusion protein comprising one or more heterologous proteindomains (e.g., about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,or more domains in addition to the nucleic acid-targeting effectorprotein). In some embodiments, the CRISPR Cas effector protein/enzyme ispart of a fusion protein comprising one or more heterologous proteindomains (e.g. about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ormore domains in addition to the CRISPR Cas enzyme). Examples of proteindomains that may be fused to an effector protein include, withoutlimitation, epitope tags, reporter gene sequences, and protein domainshaving one or more of the following activities: methylase activity,demethylase activity, transcription activation activity, transcriptionrepression activity, transcription release factor activity, histonemodification activity, RNA cleavage activity and nucleic acid bindingactivity. Non-limiting examples of epitope tags include histidine (His)tags, V5 tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags,VSV-G tags, and thioredoxin (Trx) tags. Examples of reporter genesinclude, but are not limited to, glutathione-S-transferase (GST),horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT)beta-galactosidase, beta-glucuronidase, luciferase, green fluorescentprotein (GFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellowfluorescent protein (YFP), and autofluorescent proteins including bluefluorescent protein (BFP). 
A nucleic acid-targeting effector protein may be fused to a gene sequence encoding a protein or a fragment of a protein that binds DNA molecules or binds other cellular molecules, including but not limited to maltose binding protein (MBP), S-tag, Lex A DNA binding domain (DBD) fusions, GAL4 DNA binding domain fusions, and herpes simplex virus (HSV) VP16 protein fusions. Additional domains that may form part of a fusion protein comprising a nucleic acid-targeting effector protein are described in US20110059502, incorporated herein by reference. In some embodiments, a tagged nucleic acid-targeting effector protein is used to identify the location of a target sequence. In some embodiments, a CRISPR Cas enzyme may form a component of an inducible system. The inducible nature of the system would allow for spatiotemporal control of gene editing or gene expression using a form of energy. The form of energy may include but is not limited to electromagnetic radiation, sound energy, chemical energy and thermal energy. Examples of inducible systems include tetracycline inducible promoters (Tet-On or Tet-Off), small molecule two-hybrid transcription activation systems (FKBP, ABA, etc), or light inducible systems (Phytochrome, LOV domains, or cryptochrome). In one embodiment, the CRISPR-Cas enzyme may be a part of a Light Inducible Transcriptional Effector (LITE) to direct changes in transcriptional activity in a sequence-specific manner. The components of a LITE may include a CRISPR enzyme, a light-responsive cryptochrome heterodimer (e.g. from Arabidopsis thaliana), and a transcriptional activation/repression domain. Further examples of inducible DNA binding proteins and methods for their use are provided in U.S. 61/736,465 and U.S. 61/721,283 and WO 2014/018423 and U.S. Pat. Nos. 8,889,418, 8,895,308, US20140186919, US20140242700, US20140273234, US20140335620, WO2014093635, which are hereby incorporated by reference in their entirety. 
In some aspects, the invention provides methods comprising delivering one or more polynucleotides, such as one or more vectors as described herein, one or more transcripts thereof, and/or one or more proteins transcribed therefrom, to a host cell. In some aspects, the invention further provides cells produced by such methods, and organisms (such as animals, plants, or fungi) comprising or produced from such cells. In some embodiments, a RNA-targeting effector protein in combination with (and optionally complexed with) a guide RNA or crRNA is delivered to a cell. Conventional viral and non-viral based gene transfer methods can be used to introduce nucleic acids in mammalian cells or target tissues. Such methods can be used to administer nucleic acids encoding components of a RNA-targeting system to cells in culture, or in a host organism. Non-viral vector delivery systems include DNA plasmids, RNA (e.g. a transcript of a vector described herein), naked nucleic acid, and nucleic acid complexed with a delivery vehicle, such as a liposome. Viral vector delivery systems include DNA and RNA viruses, which have either episomal or integrated genomes after delivery to the cell. For a review of gene therapy procedures, see Anderson, Science 256:808-813 (1992); Nabel & Felgner, TIBTECH 11:211-217 (1993); Mitani & Caskey, TIBTECH 11:162-166 (1993); Dillon, TIBTECH 11:167-175 (1993); Miller, Nature 357:455-460 (1992); Van Brunt, Biotechnology 6(10):1149-1154 (1988); Vigne, Restorative Neurology and Neuroscience 8:35-36 (1995); Kremer & Perricaudet, British Medical Bulletin 51(1):31-44 (1995); Haddada et al., in Current Topics in Microbiology and Immunology, Doerfler and Bohm (eds) (1995); and Yu et al., Gene Therapy 1:13-26 (1994). 
Methods of non-viral delivery of nucleic acids include lipofection, nucleofection, microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid:nucleic acid conjugates, naked DNA, artificial virions, and agent-enhanced uptake of DNA. Lipofection is described in e.g., U.S. Pat. Nos. 5,049,386, 4,946,787; and 4,897,355, and lipofection reagents are sold commercially (e.g., Transfectam™ and Lipofectin™). Cationic and neutral lipids that are suitable for efficient receptor-recognition lipofection of polynucleotides include those of Felgner, WO 91/17424; WO 91/16024. Delivery can be to cells (e.g. in vitro or ex vivo administration) or target tissues (e.g. in vivo administration).
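The degree of complementarity between a guide sequence and its target sequence, discussed above, can be illustrated with a minimal Python sketch. The function names are hypothetical. The sketch assumes an ungapped, equal-length comparison of the guide against the reverse complement of the RNA target site and counts only Watson-Crick pairs (no G-U wobble); this is a simplification and not one of the alignment algorithms named in the text, such as Smith-Waterman or Needleman-Wunsch.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return "".join(RNA_COMPLEMENT[nt] for nt in reversed(rna.upper()))


def percent_complementarity(guide, target_site):
    """Percent of guide positions that pair with the target site.

    Assumption: guide and target site have equal length and hybridize
    antiparallel without gaps, so the guide is compared position by
    position with the reverse complement of the target site.
    """
    if not guide or len(guide) != len(target_site):
        raise ValueError("guide and target site must be non-empty and of equal length")
    expected = reverse_complement(target_site)
    paired = sum(1 for g, e in zip(guide.upper(), expected) if g == e)
    return 100.0 * paired / len(guide)
```

For the hypothetical ten-nucleotide target site `AUGGCUAGCU`, the fully complementary guide is `AGCUAGCCAU`, and a guide with a single mismatch pairs at nine of ten positions, which would fall within the "about 90%" range recited above.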

The nucleic acids-targeting systems, the vector systems, the vectors and the compositions described herein may be used in various nucleic acids-targeting applications, altering or modifying synthesis of a gene product, such as a protein, nucleic acids cleavage, nucleic acids editing, nucleic acids splicing; trafficking of target nucleic acids, tracing of target nucleic acids, isolation of target nucleic acids, visualization of target nucleic acids, etc.

Exemplary Delivery Methods

Through this disclosure and the knowledge in the art, TALEs, CRISPR-Cas systems, or components thereof or nucleic acid molecules thereof (including, for instance HDR template) or nucleic acid molecules encoding or providing components thereof may be delivered by a delivery system herein described both generally and in detail.

Vector delivery, e.g., plasmid, viral delivery: The CRISPR enzyme, and/or any of the present RNAs, for instance a guide RNA, can be delivered using any suitable vector, e.g., plasmid or viral vectors, such as adeno associated virus (AAV), lentivirus, adenovirus or other viral vector types, or combinations thereof. Effector proteins and one or more guide RNAs can be packaged into one or more vectors, e.g., plasmid or viral vectors. In some embodiments, the vector, e.g., plasmid or viral vector is delivered to the tissue of interest by, for example, an intramuscular injection, while other times the delivery is via intravenous, transdermal, intranasal, oral, mucosal, or other delivery methods. Such delivery may be either via a single dose, or multiple doses. One skilled in the art understands that the actual dosage to be delivered herein may vary greatly depending upon a variety of factors, such as the vector choice, the target cell, organism, or tissue, the general condition of the subject to be treated, the degree of transformation/modification sought, the administration route, the administration mode, the type of transformation/modification sought, etc.

Such a dosage may further contain, for example, a carrier (water,saline, ethanol, glycerol, lactose, sucrose, calcium phosphate, gelatin,dextran, agar, pectin, peanut oil, sesame oil, etc.), a diluent, apharmaceutically-acceptable carrier (e.g., phosphate-buffered saline), apharmaceutically-acceptable excipient, and/or other compounds known inthe art. The dosage may further contain one or more pharmaceuticallyacceptable salts such as, for example, a mineral acid salt such as ahydrochloride, a hydrobromide, a phosphate, a sulfate, etc.; and thesalts of organic acids such as acetates, propionates, malonates,benzoates, etc. Additionally, auxiliary substances, such as wetting oremulsifying agents, pH buffering substances, gels or gelling materials,flavorings, colorants, microspheres, polymers, suspension agents, etc.may also be present herein. In addition, one or more other conventionalpharmaceutical ingredients, such as preservatives, humectants,suspending agents, surfactants, antioxidants, anticaking agents,fillers, chelating agents, coating agents, chemical stabilizers, etc.may also be present, especially if the dosage form is a reconstitutableform. Suitable exemplary ingredients include microcrystalline cellulose,carboxymethylcellulose sodium, polysorbate 80, phenylethyl alcohol,chlorobutanol, potassium sorbate, sorbic acid, sulfur dioxide, propylgallate, the parabens, ethyl vanillin, glycerin, phenol,parachlorophenol, gelatin, albumin and a combination thereof. A thoroughdiscussion of pharmaceutically acceptable excipients is available inREMINGTON'S PHARMACEUTICAL SCIENCES (Mack Pub. Co., N.J. 1991) which isincorporated by reference herein.

In an embodiment herein the delivery is via an adenovirus, which may be at a single booster dose containing at least 1×10⁵ particles (also referred to as particle units, pu) of adenoviral vector. In an embodiment herein, the dose preferably is at least about 1×10⁶ particles (for example, about 1×10⁶-1×10¹² particles), more preferably at least about 1×10⁷ particles, more preferably at least about 1×10⁸ particles (e.g., about 1×10⁸-1×10¹¹ particles or about 1×10⁸-1×10¹² particles), and most preferably at least about 1×10⁹ particles (e.g., about 1×10⁹-1×10¹⁰ particles or about 1×10⁹-1×10¹² particles), or even at least about 1×10¹⁰ particles (e.g., about 1×10¹⁰-1×10¹² particles) of the adenoviral vector. Alternatively, the dose comprises no more than about 1×10¹⁴ particles, preferably no more than about 1×10¹³ particles, even more preferably no more than about 1×10¹² particles, even more preferably no more than about 1×10¹¹ particles, and most preferably no more than about 1×10¹⁰ particles (e.g., no more than about 1×10⁹ particles). Thus, the dose may contain a single dose of adenoviral vector with, for example, about 1×10⁶ particle units (pu), about 2×10⁶ pu, about 4×10⁶ pu, about 1×10⁷ pu, about 2×10⁷ pu, about 4×10⁷ pu, about 1×10⁸ pu, about 2×10⁸ pu, about 4×10⁸ pu, about 1×10⁹ pu, about 2×10⁹ pu, about 4×10⁹ pu, about 1×10¹⁰ pu, about 2×10¹⁰ pu, about 4×10¹⁰ pu, about 1×10¹¹ pu, about 2×10¹¹ pu, about 4×10¹¹ pu, about 1×10¹² pu, about 2×10¹² pu, or about 4×10¹² pu of adenoviral vector. See, for example, the adenoviral vectors in U.S. Pat. No. 8,454,972 B2 to Nabel, et al., granted on Jun. 4, 2013; incorporated by reference herein, and the dosages at col 29, lines 36-58 thereof. In an embodiment herein, the adenovirus is delivered via multiple doses.

In an embodiment herein, the delivery is via an AAV. A therapeutically effective dosage for in vivo delivery of the AAV to a human is believed to be in the range of from about 20 to about 50 ml of saline solution containing from about 1×10¹⁰ to about 1×10¹⁰ functional AAV/ml solution. The dosage may be adjusted to balance the therapeutic benefit against any side effects. In an embodiment herein, the AAV dose is generally in the range of concentrations of from about 1×10⁵ to 1×10⁵⁰ genomes AAV, from about 1×10⁸ to 1×10²⁰ genomes AAV, from about 1×10¹⁰ to about 1×10¹⁶ genomes, or about 1×10¹¹ to about 1×10¹⁶ genomes AAV. A human dosage may be about 1×10¹³ genomes AAV. Such concentrations may be delivered in from about 0.001 ml to about 100 ml, about 0.05 to about 50 ml, or about 10 to about 25 ml of a carrier solution. Other effective dosages can be readily established by one of ordinary skill in the art through routine trials establishing dose response curves. See, for example, U.S. Pat. No. 8,404,658 B2 to Hajjar, et al., granted on Mar. 26, 2013, at col. 27, lines 45-60.
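
The relationship between carrier volume, vector concentration and total AAV dose stated above is simple arithmetic, and may be illustrated by the following Python sketch. The function names are hypothetical and are introduced only for illustration; the 25 ml volume is merely the upper end of the "about 10 to about 25 ml" range recited above.

```python
def total_vector_genomes(volume_ml: float, genomes_per_ml: float) -> float:
    """Total AAV genomes delivered = carrier volume (ml) x concentration (genomes/ml)."""
    if volume_ml <= 0 or genomes_per_ml <= 0:
        raise ValueError("volume and concentration must be positive")
    return volume_ml * genomes_per_ml


def concentration_for_dose(target_genomes: float, volume_ml: float) -> float:
    """Concentration (genomes/ml) needed to deliver target_genomes in volume_ml."""
    if target_genomes <= 0 or volume_ml <= 0:
        raise ValueError("dose and volume must be positive")
    return target_genomes / volume_ml


# Example: a human dosage of about 1e13 genomes delivered in 25 ml of carrier.
required_genomes_per_ml = concentration_for_dose(1e13, 25.0)
```

Such a calculation only restates the recited ranges; it does not by itself establish an effective dose, which is determined by dose response trials as noted above.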

In an embodiment herein the delivery is via a plasmid. In such plasmid compositions, the dosage should be a sufficient amount of plasmid to elicit a response. For instance, suitable quantities of plasmid DNA in plasmid compositions can be from about 0.1 to about 2 mg, or from about 1 μg to about 10 μg per 70 kg individual. Plasmids of the invention will generally comprise (i) a promoter; (ii) a sequence encoding a nucleic acid-targeting CRISPR enzyme, operably linked to said promoter; (iii) a selectable marker; (iv) an origin of replication; and (v) a transcription terminator downstream of and operably linked to (ii). The plasmid can also encode the RNA components of a CRISPR complex, but one or more of these may instead be encoded on a different vector.

The doses herein are based on an average 70 kg individual. The frequency of administration is within the ambit of the medical or veterinary practitioner (e.g., physician, veterinarian), or scientist skilled in the art. It is also noted that mice used in experiments are typically about 20 g and from mice experiments one can scale up to a 70 kg individual.
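
As a minimal illustration of scaling from a 20 g mouse to a 70 kg individual, the following Python sketch assumes simple linear scaling by body weight (a constant dose per kg). That scaling rule is an assumption made here for illustration; the disclosure does not specify one, and other conventions (for example, scaling by body surface area) exist. The function name is hypothetical.

```python
MOUSE_KG = 0.020  # "mice used in experiments are typically about 20 g"
HUMAN_KG = 70.0   # "an average 70 kg individual"


def scale_dose_by_weight(dose: float, from_kg: float = MOUSE_KG,
                         to_kg: float = HUMAN_KG) -> float:
    """Scale a dose linearly with body weight (assumed rule: constant dose per kg).

    The returned dose is in the same units as the input dose.
    """
    if from_kg <= 0 or to_kg <= 0:
        raise ValueError("body weights must be positive")
    return dose * (to_kg / from_kg)


# Under this assumption, every unit given to a 20 g mouse corresponds to
# 70 / 0.020 = 3500 units for a 70 kg individual.
scale_factor = scale_dose_by_weight(1.0)
```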

In some embodiments the RNA molecules of the invention are delivered in liposome or lipofectin formulations and the like and can be prepared by methods well known to those skilled in the art. Such methods are described, for example, in U.S. Pat. Nos. 5,593,972, 5,589,466, and 5,580,859, which are herein incorporated by reference. Delivery systems aimed specifically at the enhanced and improved delivery of siRNA into mammalian cells have been developed (see, for example, Shen et al., FEBS Let. 2003, 539:111-114; Xia et al., Nat. Biotech. 2002, 20:1006-1010; Reich et al., Mol. Vision. 2003, 9: 210-216; Sorensen et al., J. Mol. Biol. 2003, 327: 761-766; Lewis et al., Nat. Gen. 2002, 32: 107-108 and Simeoni et al., NAR 2003, 31, 11: 2717-2724) and may be applied to the present invention. siRNA has recently been successfully used for inhibition of gene expression in primates (see, for example, Tolentino et al., Retina 24(4):660), which may also be applied to the present invention.

Indeed, RNA delivery is a useful method of in vivo delivery. It is possible to deliver nucleic acid-targeting Cas protein and guide RNA (and, for instance, HR repair template) into cells using liposomes or particles. Thus delivery of the nucleic acid-targeting CRISPR-Cas protein and/or delivery of the guide RNAs or crRNAs of the invention may be in RNA form and via microvesicles, liposomes or particles. For example, CRISPR-Cas mRNA and guide RNA or crRNA can be packaged into liposomal particles for delivery in vivo. Liposomal transfection reagents such as lipofectamine from Life Technologies and other reagents on the market can effectively deliver RNA molecules into the liver.

Means of delivery of RNA also preferred include delivery of RNA via nanoparticles (Cho, S., Goldberg, M., Son, S., Xu, Q., Yang, F., Mei, Y., Bogatyrev, S., Langer, R. and Anderson, D., Lipid-like nanoparticles for small interfering RNA delivery to endothelial cells, Advanced Functional Materials, 19: 3112-3118, 2010) or exosomes (Schroeder, A., Levins, C., Cortez, C., Langer, R., and Anderson, D., Lipid-based nanotherapeutics for siRNA delivery, Journal of Internal Medicine, 267: 9-21, 2010, PMID: 20059641). Indeed, exosomes have been shown to be particularly useful in delivering siRNA, a system with some parallels to the RNA-targeting system. For instance, El-Andaloussi S, et al. (“Exosome-mediated delivery of siRNA in vitro and in vivo.” Nat Protoc. 2012 December; 7(12):2112-26. doi: 10.1038/nprot.2012.131. Epub 2012 Nov. 15.) describe how exosomes are promising tools for drug delivery across different biological barriers and can be harnessed for delivery of siRNA in vitro and in vivo. Their approach is to generate targeted exosomes through transfection of an expression vector, comprising an exosomal protein fused with a peptide ligand. The exosomes are then purified and characterized from transfected cell supernatant, then RNA is loaded into the exosomes. Delivery or administration according to the invention can be performed with exosomes, in particular but not limited to the brain. Vitamin E (α-tocopherol) may be conjugated with nucleic acid-targeting Cas protein and delivered to the brain along with high density lipoprotein (HDL), for example in a similar manner as was done by Uno et al. (HUMAN GENE THERAPY 22:711-719 (June 2011)) for delivering short-interfering RNA (siRNA) to the brain. Mice were infused via osmotic minipumps (model 1007D; Alzet, Cupertino, Calif.) filled with phosphate-buffered saline (PBS) or free Toc-siBACE or Toc-siBACE/HDL and connected with Brain Infusion Kit 3 (Alzet). 
A brain-infusion cannula was placed about 0.5 mm posterior to the bregma at midline for infusion into the dorsal third ventricle. Uno et al. found that as little as 3 nmol of Toc-siRNA with HDL could induce a target reduction in comparable degree by the same ICV infusion method. A similar dosage of nucleic acid-targeting effector protein conjugated to α-tocopherol and co-administered with HDL targeted to the brain may be contemplated for humans in the present invention, for example, about 3 nmol to about 3 μmol of nucleic acid-targeting effector protein targeted to the brain may be contemplated. Zou et al. (HUMAN GENE THERAPY 22:465-475 (April 2011)) describe a method of lentiviral-mediated delivery of short-hairpin RNAs targeting PKCγ for in vivo gene silencing in the spinal cord of rats. Zou et al. administered about 10 μl of a recombinant lentivirus having a titer of 1×10⁹ transducing units (TU)/ml by an intrathecal catheter. A similar dosage of nucleic acid-targeting effector protein expressed in a lentiviral vector targeted to the brain may be contemplated for humans in the present invention, for example, about 10-50 ml of nucleic acid-targeting effector protein targeted to the brain in a lentivirus having a titer of 1×10⁹ transducing units (TU)/ml may be contemplated.
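
The lentiviral quantities recited above can be expressed as total transducing units by multiplying the administered volume by the titer. The following Python sketch illustrates that arithmetic only; the function name is hypothetical.

```python
def transducing_units(volume_ul: float, titer_tu_per_ml: float) -> float:
    """Total transducing units (TU) in volume_ul microliters at a titer given in TU/ml."""
    if volume_ul < 0 or titer_tu_per_ml < 0:
        raise ValueError("volume and titer must be non-negative")
    return (volume_ul / 1000.0) * titer_tu_per_ml


TITER = 1e9  # TU/ml, the titer recited above

rat_dose_tu = transducing_units(10, TITER)        # about 10 ul administered to rats
human_low_tu = transducing_units(10_000, TITER)   # 10 ml, lower end of the 10-50 ml range
human_high_tu = transducing_units(50_000, TITER)  # 50 ml, upper end of the 10-50 ml range
```

On these figures, 10 μl corresponds to about 1×10⁷ TU, and 10-50 ml corresponds to about 1×10¹⁰ to 5×10¹⁰ TU.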

In terms of local delivery to the brain, this can be achieved in various ways. For instance, material can be delivered intrastriatally e.g., by injection. Injection can be performed stereotactically via a craniotomy.

Packaging and Promoters Generally

Ways to package RNA-targeting effector protein (CRISPR-Cas proteins) coding nucleic acid molecules, e.g., DNA, into vectors, e.g., viral vectors, to mediate genome modification in vivo include:

Single Virus Vector:

-   Vector containing two or more expression cassettes:
    -   Promoter-nucleic acid-targeting effector protein coding nucleic acid molecule-terminator
    -   Promoter-guide RNA1-terminator
    -   Promoter-guide RNA (N)-terminator (up to size limit of vector)

Double Virus Vector:

-   Vector 1 containing one expression cassette for driving the expression of RNA-targeting effector protein (CRISPR-Cas)
    -   Promoter-RNA-targeting effector (CRISPR-Cas) protein coding nucleic acid molecule-terminator
-   Vector 2 containing one or more expression cassettes for driving the expression of one or more guide RNAs or crRNAs
    -   Promoter-guide RNA1 or crRNA1-terminator
    -   Promoter-guide RNA (N) or crRNA (N)-terminator (up to size limit of vector).
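
The single-vector and double-vector layouts listed above may be pictured as collections of promoter-payload-terminator cassettes whose combined length must stay within the size limit of the vector. The following Python sketch is offered only as an illustration of that bookkeeping. The class and function names, the cassette lengths and the size limit used in the example are all hypothetical placeholders and are not values taken from this disclosure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cassette:
    """One promoter-payload-terminator expression cassette."""
    promoter: str
    payload: str
    length_bp: int  # combined length of promoter, payload and terminator


@dataclass
class Vector:
    name: str
    cassettes: List[Cassette] = field(default_factory=list)

    def total_bp(self) -> int:
        return sum(c.length_bp for c in self.cassettes)

    def fits(self, size_limit_bp: int) -> bool:
        return self.total_bp() <= size_limit_bp


def single_vector(effector: Cassette, guides: List[Cassette]) -> List[Vector]:
    """Effector cassette and all guide cassettes on one vector."""
    return [Vector("single vector", [effector, *guides])]


def double_vector(effector: Cassette, guides: List[Cassette]) -> List[Vector]:
    """Effector cassette on vector 1, guide cassettes on vector 2."""
    return [Vector("vector 1", [effector]), Vector("vector 2", list(guides))]


# Hypothetical lengths, chosen only to show the comparison.
effector = Cassette("CMV", "effector protein", 4000)
guides = [Cassette("U6", f"guide RNA{i}", 400) for i in (1, 2, 3)]
LIMIT_BP = 4750  # hypothetical size limit of the chosen vector

single_ok = all(v.fits(LIMIT_BP) for v in single_vector(effector, guides))
double_ok = all(v.fits(LIMIT_BP) for v in double_vector(effector, guides))
```

With these placeholder numbers the single vector exceeds the limit while each vector of the double-vector layout fits, which is the situation in which the double-vector layout would be chosen.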

The promoter used to drive RNA-targeting effector protein coding nucleic acid molecule expression can include the following. An AAV ITR can serve as a promoter: this is advantageous for eliminating the need for an additional promoter element (which can take up space in the vector). The additional space freed up can be used to drive the expression of additional elements (gRNA, etc.). Also, ITR activity is relatively weaker, so it can be used to reduce potential toxicity due to overexpression of nucleic acid-targeting effector protein. For ubiquitous expression, one can use the promoters CMV, CAG, CBh, PGK, SV40, Ferritin heavy or light chains, etc. For brain or other CNS expression, one can use the promoters SynapsinI for all neurons, CaMKIIalpha for excitatory neurons, GAD67 or GAD65 or VGAT for GABAergic neurons, etc. For liver expression, one can use the Albumin promoter. For lung expression, one can use SP-B. For endothelial cells, one can use ICAM. For hematopoietic cells, one can use IFNbeta or CD45. For osteoblasts, one can use OG-2. The promoter used to drive guide RNA can include Pol III promoters such as U6 or H1, or a Pol II promoter and intronic cassettes to express guide RNA or crRNA.
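
The promoter choices recited above amount to a lookup from the intended site of expression to candidate promoters. The following Python sketch records that mapping as stated in the preceding paragraph; the dictionary keys and the function name are hypothetical labels chosen for illustration, and the lists are not exhaustive (the paragraph itself ends several of them with "etc.").

```python
# Candidate promoters for driving effector protein expression, keyed by target.
EFFECTOR_PROMOTERS = {
    "ubiquitous": ["CMV", "CAG", "CBh", "PGK", "SV40",
                   "Ferritin heavy chain", "Ferritin light chain"],
    "all neurons": ["SynapsinI"],
    "excitatory neurons": ["CaMKIIalpha"],
    "gabaergic neurons": ["GAD67", "GAD65", "VGAT"],
    "liver": ["Albumin"],
    "lung": ["SP-B"],
    "endothelial cells": ["ICAM"],
    "hematopoietic cells": ["IFNbeta", "CD45"],
    "osteoblasts": ["OG-2"],
}

# Pol III promoters named above for driving guide RNA expression.
GUIDE_PROMOTERS = ["U6", "H1"]


def candidate_promoters(target: str) -> list:
    """Return the promoters listed for a target; raise KeyError if none is listed."""
    key = target.strip().lower()
    if key not in EFFECTOR_PROMOTERS:
        raise KeyError(f"no promoter listed for target: {target!r}")
    return list(EFFECTOR_PROMOTERS[key])


liver_promoters = candidate_promoters("Liver")
```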

Adeno Associated Virus (AAV)

CRISPR-Cas and one or more guide RNA or crRNA can be delivered using adeno associated virus (AAV), lentivirus, adenovirus or other plasmid or viral vector types, in particular, using formulations and doses from, for example, U.S. Pat. No. 8,454,972 (formulations, doses for adenovirus), U.S. Pat. No. 8,404,658 (formulations, doses for AAV) and U.S. Pat. No. 5,846,946 (formulations, doses for DNA plasmids) and from clinical trials and publications regarding the clinical trials involving lentivirus, AAV and adenovirus. For example, for AAV, the route of administration, formulation and dose can be as in U.S. Pat. No. 8,404,658 and as in clinical trials involving AAV. For adenovirus, the route of administration, formulation and dose can be as in U.S. Pat. No. 8,454,972 and as in clinical trials involving adenovirus. For plasmid delivery, the route of administration, formulation and dose can be as in U.S. Pat. No. 5,846,946 and as in clinical studies involving plasmids. Doses may be based on or extrapolated to an average 70 kg individual (e.g., a male adult human), and can be adjusted for patients, subjects, mammals of different weight and species. Frequency of administration is within the ambit of the medical or veterinary practitioner (e.g., physician, veterinarian), depending on usual factors including the age, sex, general health, other conditions of the patient or subject and the particular condition or symptoms being addressed. The viral vectors can be injected into the tissue of interest. For cell-type specific genome modification, the expression of RNA-targeting effector protein (CRISPR-Cas effector protein) can be driven by a cell-type specific promoter. For example, liver-specific expression might use the Albumin promoter and neuron-specific expression (e.g., for targeting CNS disorders) might use the Synapsin I promoter. 
In terms of in vivo delivery, AAV is advantageous over other viral vectors for a couple of reasons: low toxicity (this may be due to the purification method not requiring ultracentrifugation of cell particles that can activate the immune response) and a low probability of causing insertional mutagenesis because it does not integrate into the host genome.

AAV has a packaging limit of 4.5 or 4.75 Kb. This means that the RNA-targeting effector protein (CRISPR-Cas effector protein) coding sequence as well as a promoter and transcription terminator must all fit into the same viral vector. As to AAV, the AAV can be AAV1, AAV2, AAV5 or any combination thereof. One can select the AAV with regard to the cells to be targeted; e.g., one can select AAV serotypes 1, 2, 5 or a hybrid capsid AAV1, AAV2, AAV5 or any combination thereof for targeting brain or neuronal cells; and one can select AAV4 for targeting cardiac tissue. AAV8 is useful for delivery to the liver. The herein promoters and vectors are preferred individually. A tabulation of certain AAV serotypes as to these cells (see Grimm, D. et al, J. Virol. 82: 5887-5911 (2008)) is as follows:

TABLE 9

Cell Line     AAV-1  AAV-2  AAV-3  AAV-4  AAV-5  AAV-6  AAV-8  AAV-9
Huh-7         13     100    2.5    0.0    0.1    10     0.7    0.0
HEK293        25     100    2.5    0.1    0.1    5      0.7    0.1
HeLa          3      100    2.0    0.1    6.7    1      0.2    0.1
HepG2         3      100    16.7   0.3    1.7    5      0.3    ND
Hep1A         20     100    0.2    1.0    0.1    1      0.2    0.0
911           17     100    11     0.2    0.1    17     0.1    ND
CHO           100    100    14     1.4    333    50     10     1.0
COS           33     100    33     3.3    5.0    14     2.0    0.5
MeWo          10     100    20     0.3    6.7    10     1.0    0.2
NIH3T3        10     100    2.9    2.9    0.3    10     0.3    ND
A549          14     100    20     ND     0.5    10     0.5    0.1
HT1180        20     100    10     0.1    0.3    33     0.5    0.1
Monocytes     1111   100    ND     ND     125    1429   ND     ND
Immature DC   2500   100    ND     ND     222    2857   ND     ND
Mature DC     2222   100    ND     ND     333    3333   ND     ND
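
Selecting a serotype with regard to the cells to be targeted can be illustrated by ranking the Table 9 entries for a given cell line. The following Python sketch transcribes Table 9 and sorts the serotypes by tabulated value, skipping entries marked ND. The variable and function names are hypothetical. It is assumed here that a larger tabulated value indicates a more favorable serotype for that cell line; every AAV-2 entry is 100, which suggests the values are expressed relative to AAV-2, although the table itself does not state its units.

```python
SEROTYPES = ["AAV-1", "AAV-2", "AAV-3", "AAV-4", "AAV-5", "AAV-6", "AAV-8", "AAV-9"]
ND = None  # "ND" entries in Table 9

TABLE_9 = {
    "Huh-7":       [13, 100, 2.5, 0.0, 0.1, 10, 0.7, 0.0],
    "HEK293":      [25, 100, 2.5, 0.1, 0.1, 5, 0.7, 0.1],
    "HeLa":        [3, 100, 2.0, 0.1, 6.7, 1, 0.2, 0.1],
    "HepG2":       [3, 100, 16.7, 0.3, 1.7, 5, 0.3, ND],
    "Hep1A":       [20, 100, 0.2, 1.0, 0.1, 1, 0.2, 0.0],
    "911":         [17, 100, 11, 0.2, 0.1, 17, 0.1, ND],
    "CHO":         [100, 100, 14, 1.4, 333, 50, 10, 1.0],
    "COS":         [33, 100, 33, 3.3, 5.0, 14, 2.0, 0.5],
    "MeWo":        [10, 100, 20, 0.3, 6.7, 10, 1.0, 0.2],
    "NIH3T3":      [10, 100, 2.9, 2.9, 0.3, 10, 0.3, ND],
    "A549":        [14, 100, 20, ND, 0.5, 10, 0.5, 0.1],
    "HT1180":      [20, 100, 10, 0.1, 0.3, 33, 0.5, 0.1],
    "Monocytes":   [1111, 100, ND, ND, 125, 1429, ND, ND],
    "Immature DC": [2500, 100, ND, ND, 222, 2857, ND, ND],
    "Mature DC":   [2222, 100, ND, ND, 333, 3333, ND, ND],
}


def rank_serotypes(cell_line: str) -> list:
    """Return (serotype, value) pairs for a cell line, highest value first, ND omitted."""
    row = TABLE_9[cell_line]
    scored = [(s, v) for s, v in zip(SEROTYPES, row) if v is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


top_for_monocytes = rank_serotypes("Monocytes")[0]
```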

Lentivirus

Lentiviruses are complex retroviruses that have the ability to infect and express their genes in both mitotic and post-mitotic cells. The most commonly known lentivirus is the human immunodeficiency virus (HIV), which uses the envelope glycoproteins of other viruses to target a broad range of cell types. Lentiviruses may be prepared as follows. After cloning pCasES10 (which contains a lentiviral transfer plasmid backbone), HEK293FT cells at low passage (p=5) were seeded in a T-75 flask to 50% confluence the day before transfection in DMEM with 10% fetal bovine serum and without antibiotics. After 20 hours, media was changed to OptiMEM (serum-free) media and transfection was done 4 hours later. Cells were transfected with 10 μg of lentiviral transfer plasmid (pCasES10) and the following packaging plasmids: 5 μg of pMD2.G (VSV-g pseudotype), and 7.5 μg of psPAX2 (gag/pol/rev/tat). Transfection was done in 4 mL OptiMEM with a cationic lipid delivery agent (50 μl Lipofectamine 2000 and 100 μl Plus reagent). After 6 hours, the media was changed to antibiotic-free DMEM with 10% fetal bovine serum. These methods use serum during cell culture, but serum-free methods are preferred.

Lentivirus may be purified as follows. Viral supernatants were harvested after 48 hours. Supernatants were first cleared of debris and filtered through a 0.45 μm low protein binding (PVDF) filter. They were then spun in an ultracentrifuge for 2 hours at 24,000 rpm. Viral pellets were resuspended in 50 μl of DMEM overnight at 4° C. They were then aliquoted and immediately frozen at −80° C.

In another embodiment, minimal non-primate lentiviral vectors based on the equine infectious anemia virus (EIAV) are also contemplated, especially for ocular gene therapy (see, e.g., Balagaan, J Gene Med 2006; 8: 275-285). In another embodiment, RetinoStat®, an equine infectious anemia virus-based lentiviral gene therapy vector that expresses angiostatic proteins endostatin and angiostatin and that is delivered via a subretinal injection for the treatment of the wet form of age-related macular degeneration, is also contemplated (see, e.g., Binley et al., HUMAN GENE THERAPY 23:980-991 (September 2012)) and this vector may be modified for the nucleic acid-targeting system of the present invention.

In another embodiment, self-inactivating lentiviral vectors with an siRNA targeting a common exon shared by HIV tat/rev, a nucleolar-localizing TAR decoy, and an anti-CCR5-specific hammerhead ribozyme (see, e.g., DiGiusto et al. (2010) Sci Transl Med 2:36ra43) may be used and/or adapted to the nucleic acid-targeting system of the present invention. A minimum of 2.5×10⁶ CD34+ cells per kilogram patient weight may be collected and prestimulated for 16 to 20 hours in X-VIVO 15 medium (Lonza) containing 2 μmol/L-glutamine, stem cell factor (100 ng/ml), Flt-3 ligand (Flt-3L) (100 ng/ml), and thrombopoietin (10 ng/ml) (CellGenix) at a density of 2×10⁶ cells/ml. Prestimulated cells may be transduced with lentiviral vector at a multiplicity of infection of 5 for 16 to 24 hours in 75-cm² tissue culture flasks coated with fibronectin (25 mg/cm²) (RetroNectin, Takara Bio Inc.).
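
The cell numbers in the preceding paragraph follow from the patient weight, the seeding density and the multiplicity of infection. The following Python sketch works through that arithmetic for an assumed 70 kg patient; the function name and the choice of 70 kg are illustrative assumptions, and the multiplicity of infection is taken in its usual sense of infectious (transducing) units per cell.

```python
def cd34_transduction_plan(patient_kg: float,
                           cells_per_kg: float = 2.5e6,
                           density_cells_per_ml: float = 2e6,
                           moi: float = 5.0) -> dict:
    """Minimum cell number, prestimulation volume and vector amount for one patient."""
    if patient_kg <= 0:
        raise ValueError("patient weight must be positive")
    min_cells = patient_kg * cells_per_kg
    return {
        "min_cells": min_cells,
        # volume needed to hold the cells at the stated density
        "culture_volume_ml": min_cells / density_cells_per_ml,
        # MOI = transducing units per cell
        "transducing_units": min_cells * moi,
    }


plan = cd34_transduction_plan(70.0)
```

For a 70 kg patient this gives a minimum of 1.75×10⁸ cells, 87.5 ml of medium at 2×10⁶ cells/ml, and 8.75×10⁸ transducing units at a multiplicity of infection of 5.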

Lentiviral vectors have been disclosed for the treatment of Parkinson's Disease, see, e.g., US Patent Publication No. 20120295960 and U.S. Pat. Nos. 7,303,910 and 7,351,585. Lentiviral vectors have also been disclosed for the treatment of ocular diseases, see e.g., US Patent Publication Nos. 20060281180, 20090007284, US20110117189; US20090017543; US20070054961, US20100317109. Lentiviral vectors have also been disclosed for delivery to the brain, see, e.g., US Patent Publication Nos. US20110293571, US20040013648, US20070025970, US20090111106 and U.S. Pat. No. 7,259,015.

RNA Delivery

RNA delivery: The nucleic acid-targeting CRISPR-Cas protein, and/or guide RNA, can also be delivered in the form of RNA. mRNA can be synthesized using a PCR cassette containing the following elements: T7_promoter-kozak sequence (GCCACC)-effector protein-3′ UTR from beta globin-polyA tail (a string of 120 or more adenines). The cassette can be used for transcription by T7 polymerase. Guide RNAs or crRNAs can also be transcribed using in vitro transcription from a cassette containing T7_promoter-GG-guide RNA or crRNA sequence.
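
The order of elements in the two transcription cassettes described above can be illustrated by simple string concatenation. The following Python sketch is an assumption-laden illustration only. The T7 promoter string used is the 17-nucleotide core of the commonly cited consensus TAATACGACTCACTATAG, written without its final G so that the next element supplies the initiating guanine; that choice, the function names, and the short placeholder sequences for the effector coding region, 3′ UTR and guide are hypothetical and do not represent sequences of this disclosure.

```python
T7_PROMOTER = "TAATACGACTCACTATA"  # assumed core promoter, see lead-in
KOZAK = "GCCACC"                   # Kozak sequence recited above
MIN_POLY_A = 120                   # "a string of 120 or more adenines"


def mrna_template(effector_cds: str, utr3: str, poly_a_len: int = MIN_POLY_A) -> str:
    """T7_promoter-kozak-effector protein-3' UTR-polyA tail, as one DNA string."""
    if poly_a_len < MIN_POLY_A:
        raise ValueError("polyA tail should be 120 or more adenines")
    return T7_PROMOTER + KOZAK + effector_cds + utr3 + "A" * poly_a_len


def guide_template(guide_seq: str) -> str:
    """T7_promoter-GG-guide RNA or crRNA sequence, as one DNA string."""
    return T7_PROMOTER + "GG" + guide_seq


# Placeholder sequences, for illustration only.
example_mrna = mrna_template(effector_cds="ATGTAA", utr3="CCC")
example_guide = guide_template("ACGT")
```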

Particle Delivery Systems and/or Formulations:

Several types of particle delivery systems and/or formulations are known to be useful in a diverse spectrum of biomedical applications. In general, a particle is defined as a small object that behaves as a whole unit with respect to its transport and properties. Particles are further classified according to diameter. Coarse particles cover a range between 2,500 and 10,000 nanometers. Fine particles are sized between 100 and 2,500 nanometers. Ultrafine particles, or nanoparticles, are generally between 1 and 100 nanometers in size. The basis of the 100-nm limit is the fact that novel properties that differentiate particles from the bulk material typically develop at a critical length scale of under 100 nm.
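
The size classes just described can be written as a small classification routine. The following Python sketch is illustrative; the function name is hypothetical, and because the paragraph above does not say to which class a diameter of exactly 100 nm or 2,500 nm belongs, the sketch assigns each boundary value to the smaller class as an assumption.

```python
def classify_particle(diameter_nm: float) -> str:
    """Classify a particle by diameter using the ranges recited above."""
    if diameter_nm < 1 or diameter_nm > 10_000:
        return "unclassified"   # outside the 1 to 10,000 nm ranges given
    if diameter_nm <= 100:
        return "ultrafine"      # nanoparticles, 1 to 100 nm
    if diameter_nm <= 2_500:
        return "fine"           # 100 to 2,500 nm
    return "coarse"             # 2,500 to 10,000 nm


sizes_nm = [50, 500, 5_000, 20_000]
classes = [classify_particle(d) for d in sizes_nm]
```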

As used herein, a particle delivery system/formulation is defined as anybiological delivery system/formulation which includes a particle inaccordance with the present invention. A particle in accordance with thepresent invention is any entity having a greatest dimension (e.g.diameter) of less than 100 microns (μm). In some embodiments, inventiveparticles have a greatest dimension of less than 10 μm. In someembodiments, inventive particles have a greatest dimension of less than2000 nanometers (nm). In some embodiments, inventive particles have agreatest dimension of less than 1000 nanometers (nm). In someembodiments, inventive particles have a greatest dimension of less than900 nm, 800 nm, 700 nm, 600 nm, 500 nm, 400 nm, 300 nm, 200 nm, or 100nm. Typically, inventive particles have a greatest dimension (e.g.,diameter) of 500 nm or less. In some embodiments, inventive particleshave a greatest dimension (e.g., diameter) of 250 nm or less. In someembodiments, inventive particles have a greatest dimension (e.g.,diameter) of 200 nm or less. In some embodiments, inventive particleshave a greatest dimension (e.g., diameter) of 150 nm or less. In someembodiments, inventive particles have a greatest dimension (e.g.,diameter) of 100 nm or less. Smaller particles, e.g., having a greatestdimension of 50 nm or less are used in some embodiments of theinvention. In some embodiments, inventive particles have a greatestdimension ranging between 25 nm and 200 nm.

Particle characterization (including e.g., characterizing morphology, dimension, etc.) is done using a variety of different techniques. Common techniques are electron microscopy (TEM, SEM), atomic force microscopy (AFM), dynamic light scattering (DLS), X-ray photoelectron spectroscopy (XPS), powder X-ray diffraction (XRD), Fourier transform infrared spectroscopy (FTIR), matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF), ultraviolet-visible spectroscopy, dual polarisation interferometry and nuclear magnetic resonance (NMR). Characterization (dimension measurements) may be made as to native particles (i.e., preloading) or after loading of the cargo (herein cargo refers to e.g., one or more components of CRISPR-Cas system e.g., CRISPR-Cas enzyme or mRNA or guide RNA, or any combination thereof, and may include additional carriers and/or excipients) to provide particles of an optimal size for delivery for any in vitro, ex vivo and/or in vivo application of the present invention. In certain preferred embodiments, particle dimension (e.g., diameter) characterization is based on measurements using dynamic light scattering (DLS). Mention is made of U.S. Pat. Nos. 8,709,843; 6,007,845; 5,855,913; 5,985,309; 5,543,158; and the publication by James E. Dahlman and Carmen Barnes et al. Nature Nanotechnology (2014) published online 11 May 2014, doi:10.1038/nnano.2014.84, concerning particles, methods of making and using them and measurements thereof. See also Dahlman et al. “Orthogonal gene control with a catalytically active Cas9 nuclease,” Nature Biotechnology 33, 1159-1161 (November, 2015).

Particle delivery systems within the scope of the present invention may be provided in any form, including but not limited to solid, semi-solid, emulsion, or colloidal particles. As such any of the delivery systems described herein, including but not limited to, e.g., lipid-based systems, liposomes, micelles, microvesicles, exosomes, or gene gun may be provided as particle delivery systems within the scope of the present invention.

Particles

CRISPR-Cas mRNA and guide RNA or crRNA may be delivered simultaneously using particles or lipid envelopes; for instance, CRISPR enzyme and RNA of the invention, e.g., as a complex, can be delivered via a particle as in Dahlman et al., WO2015089419 A2 and documents cited therein, such as 7C1 (see, e.g., James E. Dahlman and Carmen Barnes et al. Nature Nanotechnology (2014) published online 11 May 2014, doi:10.1038/nnano.2014.84), e.g., delivery particle comprising lipid or lipidoid and hydrophilic polymer, e.g., cationic lipid and hydrophilic polymer, for instance wherein the cationic lipid comprises 1,2-dioleoyl-3-trimethylammonium-propane (DOTAP) or 1,2-ditetradecanoyl-sn-glycero-3-phosphocholine (DMPC) and/or wherein the hydrophilic polymer comprises ethylene glycol or polyethylene glycol (PEG); and/or wherein the particle further comprises cholesterol (e.g., particle from formulation number 1=DOTAP 100, DMPC 0, PEG 0, Cholesterol 0; formulation number 2=DOTAP 90, DMPC 0, PEG 10, Cholesterol 0; formulation number 3=DOTAP 90, DMPC 0, PEG 5, Cholesterol 5), wherein particles are formed using an efficient, multistep process wherein first, effector protein and RNA are mixed together, e.g., at a 1:1 molar ratio, e.g., at room temperature, e.g., for 30 minutes, e.g., in sterile, nuclease free 1×PBS; and separately, DOTAP, DMPC, PEG, and cholesterol as applicable for the formulation are dissolved in alcohol, e.g., 100% ethanol; and, the two solutions are mixed together to form particles containing the complexes. CRISPR-Cas effector protein mRNA and guide RNA may be delivered simultaneously using particles or lipid envelopes. This Dahlman et al. technology can be applied in the instant invention. An epoxide-modified lipid-polymer may be utilized to deliver the nucleic acid-targeting system of the present invention to pulmonary, cardiovascular or renal cells; however, one of skill in the art may adapt the system to deliver to other target organs. 
Dosages ranging from about 0.05 to about 0.6 mg/kg are envisioned. Dosages over several days or weeks are also envisioned, with a total dosage of about 2 mg/kg. For example, Su X, Fricke J, Kavanagh D G, Irvine D J (“In vitro and in vivo mRNA delivery using lipid-enveloped pH-responsive polymer nanoparticles” Mol Pharm. 2011 Jun. 6; 8(3):774-87. doi: 10.1021/mp100390w. Epub 2011 Apr. 1) describe biodegradable core-shell structured particles with a poly(β-amino ester) (PBAE) core enveloped by a phospholipid bilayer shell. These were developed for in vivo mRNA delivery. The pH-responsive PBAE component was chosen to promote endosome disruption, while the lipid surface layer was selected to minimize toxicity of the polycation core. Such are, therefore, preferred for delivering RNA of the present invention.
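
The three particle formulations and the dosage range recited in this passage can be recorded and checked as follows. This Python sketch is illustrative only. The passage does not state the units of the formulation numbers; the sketch assumes they are percentages that sum to 100, and the variable and function names are hypothetical.

```python
# Formulation numbers 1 to 3 as recited above (units assumed to be percent).
FORMULATIONS = {
    1: {"DOTAP": 100, "DMPC": 0, "PEG": 0, "Cholesterol": 0},
    2: {"DOTAP": 90, "DMPC": 0, "PEG": 10, "Cholesterol": 0},
    3: {"DOTAP": 90, "DMPC": 0, "PEG": 5, "Cholesterol": 5},
}


def components_sum_to_100(formulation: dict) -> bool:
    """Check the assumed percent interpretation of a formulation."""
    return sum(formulation.values()) == 100


def dose_mg(body_kg: float, mg_per_kg: float) -> float:
    """Absolute dose for a body weight, restricted to the 0.05 to 0.6 mg/kg range."""
    if not 0.05 <= mg_per_kg <= 0.6:
        raise ValueError("mg_per_kg is outside the about 0.05 to about 0.6 mg/kg range")
    return body_kg * mg_per_kg


# For a 70 kg individual the recited range spans about 3.5 mg to about 42 mg.
low_dose_mg = dose_mg(70.0, 0.05)
high_dose_mg = dose_mg(70.0, 0.6)
```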

In one embodiment, particles based on self-assembling bioadhesivepolymers are contemplated, which may be applied to oral delivery ofpeptides, intravenous delivery of peptides and nasal delivery ofpeptides, all to the brain. Other embodiments, such as oral absorptionand ocular delivery of hydrophobic drugs are also contemplated. Themolecular envelope technology involves an engineered polymer envelopewhich is protected and delivered to the site of the disease (see, e.g.,Mazza, M. et al. ACSNano, 2013. 7(2): 1016-1026; Siew, A., et al. MolPharm, 2012. 9(1):14-28; Lalatsa, A., et al. J Contr Rel, 2012.161(2):523-36; Lalatsa, A., et al., Mol Pharm, 2012. 9(6):1665-80;Lalatsa, A., et al. Mol Pharm, 2012. 9(6):1764-74; Garrett, N. L., etal. J Biophotonics, 2012. 5(5-6):458-68; Garrett, N. L., et al. J RamanSpect, 2012. 43(5):681-688; Ahmad, S., et al. J Royal Soc Interface2010. 7:S423-33; Uchegbu, I. F. Expert Opin Drug Deliv, 2006.3(5):629-40; Qu, X., et al. Biomacromolecules, 2006. 7(12):3452-9 andUchegbu, I. F., et al. Int J Pharm, 2001. 224:185-199). Doses of about 5mg/kg are contemplated, with single or multiple doses, depending on thetarget tissue.

Regarding particles, see also Alabi et al., Proc Natl Acad Sci USA. 2013 Aug. 6; 110(32):12881-6; Zhang et al., Adv Mater. 2013 Sep. 6; 25(33):4641-5; Jiang et al., Nano Lett. 2013 Mar. 13; 13(3):1059-64; Karagiannis et al., ACS Nano. 2012 Oct. 23; 6(10):8484-7; Whitehead et al., ACS Nano. 2012 Aug. 28; 6(8):6922-9 and Lee et al., Nat Nanotechnol. 2012 Jun. 3; 7(6):389-93.

US Patent Publication No. 20110293703 relates to lipidoid compounds that are also particularly useful in the administration of polynucleotides, which may be applied to deliver the nucleic acid-targeting system of the present invention. In one aspect, the aminoalcohol lipidoid compounds are combined with an agent to be delivered to a cell or a subject to form microparticles, nanoparticles, liposomes, or micelles. The agent to be delivered by the particles, liposomes, or micelles may be in the form of a gas, liquid, or solid, and the agent may be a polynucleotide, protein, peptide, or small molecule. The aminoalcohol lipidoid compounds may be combined with other aminoalcohol lipidoid compounds, polymers (synthetic or natural), surfactants, cholesterol, carbohydrates, proteins, lipids, etc. to form the particles. These particles may then optionally be combined with a pharmaceutical excipient to form a pharmaceutical composition. US Patent Publication No. 20110293703 also provides methods of preparing the aminoalcohol lipidoid compounds. One or more equivalents of an amine are allowed to react with one or more equivalents of an epoxide-terminated compound under suitable conditions to form an aminoalcohol lipidoid compound of the present invention. In certain embodiments, all the amino groups of the amine are fully reacted with the epoxide-terminated compound to form tertiary amines. In other embodiments, not all the amino groups of the amine are fully reacted with the epoxide-terminated compound to form tertiary amines, thereby resulting in primary or secondary amines in the aminoalcohol lipidoid compound. These primary or secondary amines are left as is or may be reacted with another electrophile such as a different epoxide-terminated compound. As will be appreciated by one skilled in the art, reacting an amine with less than an excess of epoxide-terminated compound will result in a plurality of different aminoalcohol lipidoid compounds with various numbers of tails. 
Certain amines may be fully functionalized with two epoxide-derived compound tails while other molecules will not be completely functionalized with epoxide-derived compound tails. For example, a diamine or polyamine may include one, two, three, or four epoxide-derived compound tails off the various amino moieties of the molecule, resulting in primary, secondary, and tertiary amines. In certain embodiments, not all the amino groups are fully functionalized. In certain embodiments, two of the same types of epoxide-terminated compounds are used. In other embodiments, two or more different epoxide-terminated compounds are used. The synthesis of the aminoalcohol lipidoid compounds is performed with or without solvent, and the synthesis may be performed at higher temperatures ranging from 30-100° C., preferably at approximately 50-90° C. The prepared aminoalcohol lipidoid compounds may be optionally purified. For example, the mixture of aminoalcohol lipidoid compounds may be purified to yield an aminoalcohol lipidoid compound with a particular number of epoxide-derived compound tails. Or the mixture may be purified to yield a particular stereo- or regioisomer. The aminoalcohol lipidoid compounds may also be alkylated using an alkyl halide (e.g., methyl iodide) or other alkylating agent, and/or they may be acylated.

US Patent Publication No. 20110293703 also provides libraries of aminoalcohol lipidoid compounds prepared by the inventive methods. These aminoalcohol lipidoid compounds may be prepared and/or screened using high-throughput techniques involving liquid handlers, robots, microtiter plates, computers, etc. In certain embodiments, the aminoalcohol lipidoid compounds are screened for their ability to transfect polynucleotides or other agents (e.g., proteins, peptides, small molecules) into the cell. US Patent Publication No. 20130302401 relates to a class of poly(beta-amino alcohols) (PBAAs) that has been prepared using combinatorial polymerization. The inventive PBAAs may be used in biotechnology and biomedical applications as coatings (such as coatings of films or multilayer films for medical devices or implants), additives, materials, excipients, non-biofouling agents, micropatterning agents, and cellular encapsulation agents. When used as surface coatings, these PBAAs elicited different levels of inflammation, both in vitro and in vivo, depending on their chemical structures. The large chemical diversity of this class of materials allowed the identification of polymer coatings that inhibit macrophage activation in vitro. Furthermore, these coatings reduce the recruitment of inflammatory cells, and reduce fibrosis, following the subcutaneous implantation of carboxylated polystyrene microparticles. These polymers may be used to form polyelectrolyte complex capsules for cell encapsulation. The invention may also have many other biological applications such as antimicrobial coatings, DNA or siRNA delivery, and stem cell tissue engineering. The teachings of US Patent Publication No. 20130302401 may be applied to the nucleic acid-targeting system of the present invention.

In another embodiment, lipid nanoparticles (LNPs) are contemplated. An antitransthyretin small interfering RNA has been encapsulated in lipid nanoparticles and delivered to humans (see, e.g., Coelho et al., N Engl J Med 2013; 369:819-29), and such a system may be adapted and applied to the nucleic acid-targeting system of the present invention. Doses of about 0.01 to about 1 mg per kg of body weight administered intravenously are contemplated. Medications to reduce the risk of infusion-related reactions, such as dexamethasone, acetaminophen, diphenhydramine or cetirizine, and ranitidine, are also contemplated. Multiple doses of about 0.3 mg per kilogram every 4 weeks for five doses are also contemplated. LNPs have been shown to be highly effective in delivering siRNAs to the liver (see, e.g., Tabernero et al., Cancer Discovery, April 2013, Vol. 3, No. 4, pages 363-470) and are therefore contemplated for delivering RNA encoding nucleic acid-targeting effector protein to the liver. About four doses of 6 mg/kg of the LNP, administered every two weeks, may be contemplated. Tabernero et al. demonstrated that tumor regression was observed after the first 2 cycles of LNPs dosed at 0.7 mg/kg, and by the end of 6 cycles the patient had achieved a partial response with complete regression of the lymph node metastasis and substantial shrinkage of the liver tumors. A complete response was obtained after 40 doses in this patient, who has remained in remission and completed treatment after receiving doses over 26 months. Two patients with RCC and extrahepatic sites of disease including kidney, lung, and lymph nodes that were progressing following prior therapy with VEGF pathway inhibitors had stable disease at all sites for approximately 8 to 12 months, and a patient with PNET and liver metastases continued on the extension study for 18 months (36 doses) with stable disease. However, the charge of the LNP must be taken into consideration. 
Cationic lipids have been combined with negatively charged lipids to induce nonbilayer structures that facilitate intracellular delivery. Because charged LNPs are rapidly cleared from circulation following intravenous injection, ionizable cationic lipids with pKa values below 7 were developed (see, e.g., Rosin et al., Molecular Therapy, vol. 19, no. 12, pages 1286-2200, December 2011). Negatively charged polymers such as RNA may be loaded into LNPs at low pH values (e.g., pH 4) where the ionizable lipids display a positive charge. However, at physiological pH values, the LNPs exhibit a low surface charge compatible with longer circulation times. Four species of ionizable cationic lipids have been focused upon, namely 1,2-dilinoleoyl-3-dimethylammonium-propane (DLinDAP), 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane (DLinDMA), 1,2-dilinoleyloxy-keto-N,N-dimethyl-3-aminopropane (DLinKDMA), and 1,2-dilinoleyl-4-(2-dimethylaminoethyl)-[1,3]-dioxolane (DLinKC2-DMA). It has been shown that LNP siRNA systems containing these lipids exhibit remarkably different gene silencing properties in hepatocytes in vivo, with potencies varying according to the series DLinKC2-DMA > DLinKDMA > DLinDMA >> DLinDAP employing a Factor VII gene silencing model (see, e.g., Rosin et al., Molecular Therapy, vol. 19, no. 12, pages 1286-2200, December 2011). A dosage of 1 μg/ml of LNP or CRISPR-Cas RNA in or associated with the LNP may be contemplated, especially for a formulation containing DLinKC2-DMA.
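The pH dependence described above (positively charged at about pH 4 for RNA loading, low surface charge at physiological pH) can be illustrated with the Henderson-Hasselbalch relation. The following is a minimal sketch, not a model of any specific lipid: the pKa of 6.5 is a hypothetical value chosen only because it lies below 7, and the lipid is treated as a simple monoprotic amine in free solution.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a monoprotic amine that is protonated (cationic) at a given pH.

    Uses the Henderson-Hasselbalch relation: fraction = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical pKa below 7, for illustration only; not a measured value
# for DLinDAP, DLinDMA, DLinKDMA, or DLinKC2-DMA.
ASSUMED_PKA = 6.5

for ph in (4.0, 7.4):  # loading pH from the text and a typical physiological pH
    fraction = protonated_fraction(ASSUMED_PKA, ph)
    print(f"pH {ph}: {fraction:.1%} of the ionizable lipid is protonated")
```

Under this assumption nearly all of the lipid is cationic at pH 4, while only a minor fraction remains charged at pH 7.4, which is consistent with the loading and circulation behavior described above.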

Preparation of LNPs and CRISPR-Cas encapsulation may be used and/or adapted from Rosin et al., Molecular Therapy, vol. 19, no. 12, pages 1286-2200, December 2011. The cationic lipids 1,2-dilinoleoyl-3-dimethylammonium-propane (DLinDAP), 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane (DLinDMA), 1,2-dilinoleyloxyketo-N,N-dimethyl-3-aminopropane (DLinK-DMA), 1,2-dilinoleyl-4-(2-dimethylaminoethyl)-[1,3]-dioxolane (DLinKC2-DMA), 3-o-[2″-(methoxypolyethyleneglycol 2000) succinoyl]-1,2-dimyristoyl-sn-glycol (PEG-S-DMG), and R-3-[(ω-methoxy-poly(ethylene glycol)2000) carbamoyl]-1,2-dimyristyloxypropyl-3-amine (PEG-C-DOMG) may be provided by Tekmira Pharmaceuticals (Vancouver, Canada) or synthesized. Cholesterol may be purchased from Sigma (St Louis, Mo.). The specific nucleic acid-targeting complex (CRISPR-Cas) RNA may be encapsulated in LNPs containing DLinDAP, DLinDMA, DLinK-DMA, and DLinKC2-DMA (cationic lipid:DSPC:CHOL:PEG-S-DMG or PEG-C-DOMG at 40:10:40:10 molar ratios). When required, 0.2% SP-DiOC18 (Invitrogen, Burlington, Canada) may be incorporated to assess cellular uptake, intracellular delivery, and biodistribution. Encapsulation may be performed by dissolving lipid mixtures comprised of cationic lipid:DSPC:cholesterol:PEG-C-DOMG (40:10:40:10 molar ratio) in ethanol to a final lipid concentration of 10 mmol/l. This ethanol solution of lipid may be added drop-wise to 50 mmol/l citrate, pH 4.0 to form multilamellar vesicles to produce a final concentration of 30% ethanol vol/vol. Large unilamellar vesicles may be formed following extrusion of multilamellar vesicles through two stacked 80 nm Nuclepore polycarbonate filters using the Extruder (Northern Lipids, Vancouver, Canada). Encapsulation may be achieved by adding RNA dissolved at 2 mg/ml in 50 mmol/l citrate, pH 4.0 containing 30% ethanol vol/vol drop-wise to extruded preformed large unilamellar vesicles and incubation at 31° C. for 30 minutes with constant mixing to a final RNA/lipid weight ratio of 0.06/1 wt/wt. 
Removal of ethanol and neutralization of formulation buffer may be performed by dialysis against phosphate-buffered saline (PBS), pH 7.4 for 16 hours using Spectra/Por 2 regenerated cellulose dialysis membranes. Particle size distribution may be determined by dynamic light scattering using a NICOMP 370 particle sizer, the vesicle/intensity modes, and Gaussian fitting (Nicomp Particle Sizing, Santa Barbara, Calif.). The particle size for all three LNP systems may be ˜70 nm in diameter. RNA encapsulation efficiency may be determined by removal of free RNA using VivaPureD MiniH columns (Sartorius Stedim Biotech) from samples collected before and after dialysis. The encapsulated RNA may be extracted from the eluted particles and quantified at 260 nm. RNA to lipid ratio may be determined by measurement of cholesterol content in vesicles using the Cholesterol E enzymatic assay from Wako Chemicals USA (Richmond, Va.). In conjunction with the discussion herein of LNPs and PEG lipids, PEGylated liposomes or LNPs are likewise suitable for delivery of a nucleic acid-targeting system or components thereof. Preparation of large LNPs may be used and/or adapted from Rosin et al., Molecular Therapy, vol. 19, no. 12, pages 1286-2200, December 2011. A lipid premix solution (20.4 mg/ml total lipid concentration) may be prepared in ethanol containing DLinKC2-DMA, DSPC, and cholesterol at 50:10:38.5 molar ratios. Sodium acetate may be added to the lipid premix at a molar ratio of 0.75:1 (sodium acetate:DLinKC2-DMA). The lipids may be subsequently hydrated by combining the mixture with 1.85 volumes of citrate buffer (10 mmol/l, pH 3.0) with vigorous stirring, resulting in spontaneous liposome formation in aqueous buffer containing 35% ethanol. The liposome solution may be incubated at 37° C. to allow for time-dependent increase in particle size. 
Aliquots may be removed at various times during incubation to investigate changes in liposome size by dynamic light scattering (Zetasizer Nano ZS, Malvern Instruments, Worcestershire, UK). Once the desired particle size is achieved, an aqueous PEG lipid solution (stock=10 mg/ml PEG-DMG in 35% (vol/vol) ethanol) may be added to the liposome mixture to yield a final PEG molar concentration of 3.5% of total lipid. Upon addition of PEG-lipids, the liposomes should maintain their size, effectively quenching further growth. RNA may then be added to the empty liposomes at an RNA to total lipid ratio of approximately 1:10 (wt:wt), followed by incubation for 30 minutes at 37° C. to form loaded LNPs. The mixture may be subsequently dialyzed overnight in PBS and filtered with a 0.45-μm syringe filter.
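The ethanol percentages and RNA-to-lipid ratios quoted in the two preparations above can be checked with simple arithmetic. The sketch below is illustrative only: it assumes that volumes are additive on mixing (ethanol and water contract slightly in practice) and that the lipid premix behaves as pure ethanol; the function names and the 10 mg lipid batch are hypothetical.

```python
def ethanol_fraction(ethanol_volume: float, aqueous_volume: float) -> float:
    """Ethanol vol/vol fraction after mixing, assuming additive volumes."""
    return ethanol_volume / (ethanol_volume + aqueous_volume)


def aqueous_volume_for(ethanol_volume: float, target_fraction: float) -> float:
    """Aqueous buffer volume needed to dilute an ethanol solution to a target fraction."""
    return ethanol_volume * (1.0 - target_fraction) / target_fraction


def rna_mass_for(lipid_mass_mg: float, rna_to_lipid_wt_ratio: float) -> float:
    """RNA mass (mg) that gives the stated RNA/lipid weight ratio."""
    return lipid_mass_mg * rna_to_lipid_wt_ratio


# Large-LNP preparation: 1 volume of ethanolic premix plus 1.85 volumes of citrate buffer.
print(f"ethanol after hydration: {ethanol_fraction(1.0, 1.85):.1%}")

# First preparation: buffer needed per ml of ethanolic lipid to reach 30% ethanol.
print(f"buffer per ml ethanol for 30%: {aqueous_volume_for(1.0, 0.30):.2f} ml")

# RNA needed for a hypothetical 10 mg lipid batch at 0.06 wt/wt and at 1:10 wt:wt.
print(f"RNA at 0.06 wt/wt: {rna_mass_for(10.0, 0.06):.2f} mg")
print(f"RNA at 1:10 wt:wt: {rna_mass_for(10.0, 0.10):.2f} mg")
```

Under the additive-volume assumption, 1.85 volumes of buffer per volume of premix gives about 35% ethanol, in agreement with the figure stated in the text.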

Spherical Nucleic Acid (SNA™) constructs and other particles (particularly gold particles) are also contemplated as a means to deliver the nucleic acid-targeting system to intended targets. Significant data show that AuraSense Therapeutics' Spherical Nucleic Acid (SNA™) constructs, based upon nucleic acid-functionalized gold particles, are useful.

Literature that may be employed in conjunction with the teachings herein includes: Cutler et al., J. Am. Chem. Soc. 2011 133:9254-9257; Hao et al., Small. 2011 7:3158-3162; Zhang et al., ACS Nano. 2011 5:6962-6970; Cutler et al., J. Am. Chem. Soc. 2012 134:1376-1391; Young et al., Nano Lett. 2012 12:3867-71; Zheng et al., Proc. Natl. Acad. Sci. USA. 2012 109:11975-80; Mirkin, Nanomedicine 2012 7:635-638; Zhang et al., J. Am. Chem. Soc. 2012 134:16488-16491; Weintraub, Nature 2013 495:S14-S16; Choi et al., Proc. Natl. Acad. Sci. USA. 2013 110(19):7625-7630; Jensen et al., Sci. Transl. Med. 5, 209ra152 (2013); and Mirkin, et al., Small, 10:186-192.

Self-assembling particles with RNA may be constructed with polyethyleneimine (PEI) that is PEGylated with an Arg-Gly-Asp (RGD) peptide ligand attached at the distal end of the polyethylene glycol (PEG). This system has been used, for example, as a means to target tumor neovasculature expressing integrins and deliver siRNA inhibiting vascular endothelial growth factor receptor-2 (VEGF R2) expression and thereby inhibit tumor angiogenesis (see, e.g., Schiffelers et al., Nucleic Acids Research, 2004, Vol. 32, No. 19). Nanoplexes may be prepared by mixing equal volumes of aqueous solutions of cationic polymer and nucleic acid to give a net molar excess of ionizable nitrogen (polymer) to phosphate (nucleic acid) over the range of 2 to 6. The electrostatic interactions between cationic polymers and nucleic acid result in the formation of polyplexes with an average particle size distribution of about 100 nm, hence referred to here as nanoplexes. A dosage of about 100 to 200 mg of nucleic acid-targeting complex RNA is envisioned for delivery in the self-assembling particles of Schiffelers et al.
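The molar excess of ionizable nitrogen to phosphate (often called the N/P ratio) over the range of 2 to 6 can be turned into a polymer mass for a given amount of nucleic acid. The sketch below is a rough illustration under two stated assumptions that do not come from the text: about 43 g of unmodified PEI per mole of nitrogen (one ethyleneimine repeat unit) and about 330 g of RNA per mole of phosphate (an average nucleotide). A PEGylated, RGD-conjugated PEI would carry more mass per nitrogen, so real formulations would need the measured value for the actual polymer.

```python
# Assumed conversion factors (not taken from the text).
PEI_G_PER_MOL_NITROGEN = 43.0     # one -CH2CH2NH- repeat unit of unmodified PEI
RNA_G_PER_MOL_PHOSPHATE = 330.0   # approximate average mass per nucleotide


def polymer_mass_ug(nucleic_acid_ug: float, n_to_p: float) -> float:
    """Polymer mass (ug) needed to reach a given N/P ratio for a nucleic acid mass (ug)."""
    phosphate_umol = nucleic_acid_ug / RNA_G_PER_MOL_PHOSPHATE
    nitrogen_umol = phosphate_umol * n_to_p
    return nitrogen_umol * PEI_G_PER_MOL_NITROGEN


# Polymer needed for a hypothetical 10 ug of RNA across the stated N/P range.
for n_to_p in (2, 4, 6):
    print(f"N/P {n_to_p}: {polymer_mass_ug(10.0, n_to_p):.2f} ug polymer per 10 ug RNA")
```

Because the relation is linear, tripling the N/P ratio from 2 to 6 triples the polymer mass for the same nucleic acid input.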

The nanoplexes of Bartlett et al. (PNAS, Sep. 25, 2007, vol. 104, no. 39) may also be applied to the present invention. The nanoplexes of Bartlett et al. are prepared by mixing equal volumes of aqueous solutions of cationic polymer and nucleic acid to give a net molar excess of ionizable nitrogen (polymer) to phosphate (nucleic acid) over the range of 2 to 6. The electrostatic interactions between cationic polymers and nucleic acid result in the formation of polyplexes with an average particle size distribution of about 100 nm, hence referred to here as nanoplexes. The DOTA-siRNA of Bartlett et al. was synthesized as follows: 1,4,7,10-tetraazacyclododecane-1,4,7,10-tetraacetic acid mono(N-hydroxysuccinimide ester) (DOTA-NHS-ester) was ordered from Macrocyclics (Dallas, Tex.). The amine-modified RNA sense strand with a 100-fold molar excess of DOTA-NHS-ester in carbonate buffer (pH 9) was added to a microcentrifuge tube. The contents were reacted by stirring for 4 h at room temperature. The DOTA-RNA sense conjugate was ethanol-precipitated, resuspended in water, and annealed to the unmodified antisense strand to yield DOTA-siRNA. All liquids were pretreated with Chelex-100 (Bio-Rad, Hercules, Calif.) to remove trace metal contaminants. Tf-targeted and nontargeted siRNA particles may be formed by using cyclodextrin-containing polycations. Typically, particles were formed in water at a charge ratio of 3 (+/−) and an siRNA concentration of 0.5 g/liter. One percent of the adamantane-PEG molecules on the surface of the targeted particles were modified with Tf (adamantane-PEG-Tf). The particles were suspended in a 5% (wt/vol) glucose carrier solution for injection.

Davis et al. (Nature, Vol 464, 15 Apr. 2010) conducted an RNA clinical trial that used a targeted particle-delivery system (clinical trial registration number NCT00689065). Patients with solid cancers refractory to standard-of-care therapies were administered doses of targeted particles on days 1, 3, 8 and 10 of a 21-day cycle by a 30-min intravenous infusion. The particles comprise, consist essentially of, or consist of a synthetic delivery system containing: (1) a linear, cyclodextrin-based polymer (CDP), (2) a human transferrin protein (TF) targeting ligand displayed on the exterior of the nanoparticle to engage TF receptors (TFR) on the surface of the cancer cells, (3) a hydrophilic polymer (polyethylene glycol (PEG) used to promote nanoparticle stability in biological fluids), and (4) siRNA designed to reduce the expression of RRM2 (the sequence used in the clinic was previously denoted siR2B+5). The TFR has long been known to be upregulated in malignant cells, and RRM2 is an established anti-cancer target. These particles (clinical version denoted as CALAA-01) have been shown to be well tolerated in multi-dosing studies in non-human primates. Although a single patient with chronic myeloid leukemia has been administered siRNA by liposomal delivery, Davis et al.'s clinical trial is the initial human trial to systemically deliver siRNA with a targeted delivery system and to treat patients with solid cancer. To ascertain whether the targeted delivery system can provide effective delivery of functional siRNA to human tumors, Davis et al. investigated biopsies from three patients from three different dosing cohorts; patients A, B and C, all of whom had metastatic melanoma and received CALAA-01 doses of 18, 24 and 30 mg m⁻² siRNA, respectively. Similar doses may also be contemplated for the nucleic acid-targeting system of the present invention. 
The delivery of the invention may be achieved with particles containing a linear, cyclodextrin-based polymer (CDP), a human transferrin protein (TF) targeting ligand displayed on the exterior of the particle to engage TF receptors (TFR) on the surface of the cancer cells and/or a hydrophilic polymer (for example, polyethylene glycol (PEG) used to promote particle stability in biological fluids).
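Doses expressed in mg m⁻², such as the 18, 24 and 30 mg m⁻² levels above, scale with body surface area rather than body weight. The sketch below shows one way such a dose could be converted to an absolute amount per infusion and per 21-day cycle. It assumes the Mosteller formula for body surface area, which is a common clinical estimate but is not named in the text, and the patient height and weight are hypothetical.

```python
import math

DOSE_LEVELS_MG_PER_M2 = (18, 24, 30)   # dose levels quoted in the text
DOSING_DAYS = (1, 3, 8, 10)            # infusion days within each 21-day cycle


def mosteller_bsa_m2(height_cm: float, weight_kg: float) -> float:
    """Body surface area (m^2) by the Mosteller formula: sqrt(height * weight / 3600)."""
    return math.sqrt(height_cm * weight_kg / 3600.0)


def per_infusion_dose_mg(dose_mg_per_m2: float, height_cm: float, weight_kg: float) -> float:
    """Absolute dose (mg) for one infusion at a given mg/m^2 level."""
    return dose_mg_per_m2 * mosteller_bsa_m2(height_cm, weight_kg)


# Hypothetical patient used only to illustrate the arithmetic.
HEIGHT_CM, WEIGHT_KG = 170.0, 70.0

for level in DOSE_LEVELS_MG_PER_M2:
    single = per_infusion_dose_mg(level, HEIGHT_CM, WEIGHT_KG)
    per_cycle = single * len(DOSING_DAYS)
    print(f"{level} mg/m^2: {single:.1f} mg per infusion, {per_cycle:.1f} mg per cycle")
```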

In terms of this invention, it is preferred to have one or more components of the RNA-targeting complex, e.g., nucleic acid-targeting effector (CRISPR-Cas) protein or mRNA therefor, or guide RNA or crRNA, delivered using particles or lipid envelopes. Other delivery systems or vectors may be used in conjunction with the particle aspects of the invention. Particles encompassed in the present invention may be provided in different forms, e.g., as solid particles (e.g., metal such as silver, gold, iron, or titanium; non-metal; lipid-based solids; polymers), suspensions of particles, or combinations thereof. Metal, dielectric, and semiconductor particles may be prepared, as well as hybrid structures (e.g., core-shell particles). Particles made of semiconducting material may also be labeled quantum dots if they are small enough (typically sub 10 nm) that quantization of electronic energy levels occurs. Such nanoscale particles are used in biomedical applications as drug carriers or imaging agents and may be adapted for similar purposes in the present invention.

Semi-solid and soft particles have been manufactured, and are within the scope of the present invention. A prototype particle of semi-solid nature is the liposome. Various types of liposome particles are currently used clinically as delivery systems for anticancer drugs and vaccines. Particles with one half hydrophilic and the other half hydrophobic are termed Janus particles and are particularly effective for stabilizing emulsions. They can self-assemble at water/oil interfaces and act as solid surfactants.

U.S. Pat. No. 8,709,843, incorporated herein by reference, provides a drug delivery system for targeted delivery of therapeutic agent-containing particles to tissues, cells, and intracellular compartments. The invention provides targeted particles comprising polymer conjugated to a surfactant, hydrophilic polymer or lipid. U.S. Pat. No. 6,007,845, incorporated herein by reference, provides particles which have a core of a multiblock copolymer formed by covalently linking a multifunctional compound with one or more hydrophobic polymers and one or more hydrophilic polymers, and contain a biologically active material. U.S. Pat. No. 5,855,913, incorporated herein by reference, provides a particulate composition having aerodynamically light particles having a tap density of less than 0.4 g/cm³ with a mean diameter of between 5 μm and 30 μm, incorporating a surfactant on the surface thereof for drug delivery to the pulmonary system. U.S. Pat. No. 5,985,309, incorporated herein by reference, provides particles incorporating a surfactant and/or a hydrophilic or hydrophobic complex of a positively or negatively charged therapeutic or diagnostic agent and a charged molecule of opposite charge for delivery to the pulmonary system. U.S. Pat. No. 5,543,158, incorporated herein by reference, provides biodegradable injectable particles having a biodegradable solid core containing a biologically active material and poly(alkylene glycol) moieties on the surface. WO2012135025 (also published as US20120251560), incorporated herein by reference, describes conjugated polyethyleneimine (PEI) polymers and conjugated aza-macrocycles (collectively referred to as “conjugated lipomer” or “lipomers”). 
In certain embodiments, it can be envisioned that such methods and materials of herein-cited documents, e.g., conjugated lipomers, can be used in the context of the nucleic acid-targeting system to achieve in vitro, ex vivo and in vivo genomic perturbations to modify gene expression, including modulation of protein expression.

Exosomes

Exosomes are endogenous nano-vesicles that transport RNAs and proteins, and which can deliver RNA to the brain and other target organs. To reduce immunogenicity, Alvarez-Erviti et al. (2011, Nat Biotechnol 29: 341) used self-derived dendritic cells for exosome production. Targeting to the brain was achieved by engineering the dendritic cells to express Lamp2b, an exosomal membrane protein, fused to the neuron-specific RVG peptide. Purified exosomes were loaded with exogenous RNA by electroporation. Intravenously injected RVG-targeted exosomes delivered GAPDH siRNA specifically to neurons, microglia, and oligodendrocytes in the brain, resulting in a specific gene knockdown. Pre-exposure to RVG exosomes did not attenuate knockdown, and non-specific uptake in other tissues was not observed. The therapeutic potential of exosome-mediated siRNA delivery was demonstrated by the strong mRNA (60%) and protein (62%) knockdown of BACE1, a therapeutic target in Alzheimer's disease.

To obtain a pool of immunologically inert exosomes, Alvarez-Erviti et al. harvested bone marrow from inbred C57BL/6 mice with a homogenous major histocompatibility complex (MHC) haplotype. As immature dendritic cells produce large quantities of exosomes devoid of T-cell activators such as MHC-II and CD86, Alvarez-Erviti et al. selected for dendritic cells with granulocyte/macrophage-colony stimulating factor (GM-CSF) for 7 d. Exosomes were purified from the culture supernatant the following day using well-established ultracentrifugation protocols. The exosomes produced were physically homogenous, with a size distribution peaking at 80 nm in diameter as determined by nanoparticle tracking analysis (NTA) and electron microscopy. Alvarez-Erviti et al. obtained 6-12 μg of exosomes (measured based on protein concentration) per 10⁶ cells. Next, Alvarez-Erviti et al. investigated the possibility of loading modified exosomes with exogenous cargoes using electroporation protocols adapted for nanoscale applications. As electroporation for membrane particles at the nanometer scale is not well-characterized, nonspecific Cy5-labeled RNA was used for the empirical optimization of the electroporation protocol. The amount of encapsulated RNA was assayed after ultracentrifugation and lysis of exosomes. Electroporation at 400 V and 125 μF resulted in the greatest retention of RNA and was used for all subsequent experiments. Alvarez-Erviti et al. 
administered 150 μg of each BACE1 siRNA encapsulated in 150 μg of RVG exosomes to normal C57BL/6 mice and compared the knockdown efficiency to four controls: untreated mice, mice injected with RVG exosomes only, mice injected with BACE1 siRNA complexed to an in vivo cationic liposome reagent and mice injected with BACE1 siRNA complexed to RVG-9R, the RVG peptide conjugated to 9 D-arginines that electrostatically binds to the siRNA. Cortical tissue samples were analyzed 3 d after administration and a significant protein knockdown (45%, P<0.05, versus 62%, P<0.01) in both siRNA-RVG-9R-treated and siRNA-RVG exosome-treated mice was observed, resulting from a significant decrease in BACE1 mRNA levels (66% ± 15%, P<0.001 and 61% ± 13% respectively, P<0.01). Moreover, Alvarez-Erviti et al. demonstrated a significant decrease (55%, P<0.05) in the total β-amyloid 1-42 levels, a main component of the amyloid plaques in Alzheimer's pathology, in the RVG-exosome-treated animals. The decrease observed was greater than the β-amyloid 1-40 decrease demonstrated in normal mice after intraventricular injection of BACE1 inhibitors. Alvarez-Erviti et al. carried out 5′-rapid amplification of cDNA ends (RACE) on BACE1 cleavage product, which provided evidence of RNAi-mediated knockdown by the siRNA. Finally, Alvarez-Erviti et al. investigated whether RNA-RVG exosomes induced immune responses in vivo by assessing IL-6, IP-10, TNFα and IFN-α serum concentrations. Following exosome treatment, nonsignificant changes in all cytokines were registered, similar to siRNA-transfection reagent treatment and in contrast to siRNA-RVG-9R, which potently stimulated IL-6 secretion, confirming the immunologically inert profile of the exosome treatment. Given that exosomes encapsulate only 20% of siRNA, delivery with RVG-exosomes appears to be more efficient than RVG-9R delivery, as comparable mRNA knockdown and greater protein knockdown were achieved with fivefold less siRNA without the corresponding level of immune stimulation. 
This experiment demonstrated the therapeutic potential of RVG-exosome technology, which is potentially suited for long-term silencing of genes related to neurodegenerative diseases. The exosome delivery system of Alvarez-Erviti et al. may be applied to deliver the nucleic acid-targeting system of the present invention to therapeutic targets, especially neurodegenerative diseases. A dosage of about 100 to 1000 mg of nucleic acid-targeting system encapsulated in about 100 to 1000 mg of RVG exosomes may be contemplated for the present invention.
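The "fivefold less siRNA" comparison above follows from the stated 20% encapsulation: of the 150 μg of siRNA used with the exosomes, only about a fifth is actually carried inside them. The short sketch below makes that arithmetic explicit. It assumes, as an interpretation of the text, that the RVG-9R arm delivers the full 150 μg of complexed siRNA.

```python
# Figures quoted in the text above.
SIRNA_INPUT_UG = 150.0          # siRNA used per mouse in each treatment arm
ENCAPSULATION_FRACTION = 0.20   # share of input siRNA retained inside exosomes


def encapsulated_sirna_ug(input_ug: float, fraction: float) -> float:
    """siRNA mass (ug) actually carried inside the exosomes."""
    return input_ug * fraction


def fold_reduction(reference_ug: float, delivered_ug: float) -> float:
    """How many times less siRNA is delivered relative to a reference amount."""
    return reference_ug / delivered_ug


exosome_payload_ug = encapsulated_sirna_ug(SIRNA_INPUT_UG, ENCAPSULATION_FRACTION)
# Assumption: the RVG-9R arm delivers the full input amount of siRNA.
fold = fold_reduction(SIRNA_INPUT_UG, exosome_payload_ug)
print(f"exosome payload: {exosome_payload_ug:.0f} ug siRNA")
print(f"reduction relative to RVG-9R: {fold:.0f}-fold")
```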

El-Andaloussi et al. (Nature Protocols 7, 2112-2126 (2012)) describe how exosomes derived from cultured cells can be harnessed for delivery of RNA in vitro and in vivo. This protocol first describes the generation of targeted exosomes through transfection of an expression vector, comprising an exosomal protein fused with a peptide ligand. Next, El-Andaloussi et al. explain how to purify and characterize exosomes from transfected cell supernatant. Next, El-Andaloussi et al. detail crucial steps for loading RNA into exosomes. Finally, El-Andaloussi et al. outline how to use exosomes to efficiently deliver RNA in vitro and in vivo in mouse brain. Examples of anticipated results in which exosome-mediated RNA delivery is evaluated by functional assays and imaging are also provided. The entire protocol takes ˜3 weeks. Delivery or administration according to the invention may be performed using exosomes produced from self-derived dendritic cells. From the teachings herein, this can be employed in the practice of the invention.

In another embodiment, the plasma exosomes of Wahlgren et al. (Nucleic Acids Research, 2012, Vol. 40, No. 17 e130) are contemplated. Exosomes are nano-sized vesicles (30-90 nm in size) produced by many cell types, including dendritic cells (DC), B cells, T cells, mast cells, epithelial cells and tumor cells. These vesicles are formed by inward budding of late endosomes and are then released to the extracellular environment upon fusion with the plasma membrane. Because exosomes naturally carry RNA between cells, this property may be useful in gene therapy, and from this disclosure can be employed in the practice of the instant invention. Exosomes from plasma can be prepared by centrifugation of buffy coat at 900 g for 20 min to isolate the plasma followed by harvesting cell supernatants, centrifuging at 300 g for 10 min to eliminate cells and at 16500 g for 30 min followed by filtration through a 0.22 μm filter. Exosomes are pelleted by ultracentrifugation at 120000 g for 70 min. Chemical transfection of siRNA into exosomes is carried out according to the manufacturer's instructions in the RNAi Human/Mouse Starter Kit (Qiagen, Hilden, Germany). siRNA is added to 100 μl PBS at a final concentration of 2 μmol/ml. After adding HiPerFect transfection reagent, the mixture is incubated for 10 min at RT. In order to remove the excess of micelles, the exosomes are re-isolated using aldehyde/sulfate latex beads. The chemical transfection of the nucleic acid-targeting system into exosomes may be conducted similarly to siRNA. The exosomes may be co-cultured with monocytes and lymphocytes isolated from the peripheral blood of healthy donors. Therefore, it may be contemplated that exosomes containing the nucleic acid-targeting system may be introduced to monocytes and lymphocytes of, and autologously reintroduced into, a human. Accordingly, delivery or administration according to the invention may be performed using plasma exosomes.
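The centrifugation steps above are given as relative centrifugal force (multiples of g), whereas centrifuges are often set in rpm. The sketch below converts between the two using the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm², with r in cm. The 10 cm rotor radius is a hypothetical placeholder and the step labels are paraphrased for illustration; in practice each step would use the radius of the rotor actually installed, and the low-speed and ultracentrifugation steps typically use different instruments.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a rotor speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a target relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


ASSUMED_RADIUS_CM = 10.0  # hypothetical rotor radius, for illustration only

# (label, force in g, duration in min); forces and times are those quoted in the text.
STEPS = [
    ("isolate plasma from buffy coat", 900, 20),
    ("eliminate cells", 300, 10),
    ("second clearing spin", 16500, 30),
    ("pellet exosomes", 120000, 70),
]

for label, rcf, minutes in STEPS:
    rpm = rpm_for_rcf(rcf, ASSUMED_RADIUS_CM)
    print(f"{label}: {rcf} g for {minutes} min, about {rpm:,.0f} rpm at r = {ASSUMED_RADIUS_CM} cm")
```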

Liposomes

Delivery or administration according to the invention can be performed with liposomes. Liposomes are spherical vesicle structures composed of a uni- or multilamellar lipid bilayer surrounding internal aqueous compartments and a relatively impermeable outer lipophilic phospholipid bilayer. Liposomes have gained considerable attention as drug delivery carriers because they are biocompatible, nontoxic, can deliver both hydrophilic and lipophilic drug molecules, protect their cargo from degradation by plasma enzymes, and transport their load across biological membranes and the blood brain barrier (BBB) (see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011. doi:10.1155/2011/469679 for review). Liposomes can be made from several different types of lipids; however, phospholipids are most commonly used to generate liposomes as drug carriers. Although liposome formation is spontaneous when a lipid film is mixed with an aqueous solution, it can also be expedited by applying force in the form of shaking by using a homogenizer, sonicator, or an extrusion apparatus (see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011. doi:10.1155/2011/469679 for review).

Several other additives may be added to liposomes in order to modify their structure and properties. For instance, either cholesterol or sphingomyelin may be added to the liposomal mixture in order to help stabilize the liposomal structure and to prevent the leakage of the liposomal inner cargo. Further, liposomes may be prepared from hydrogenated egg phosphatidylcholine or egg phosphatidylcholine, cholesterol, and dicetyl phosphate, with their mean vesicle sizes adjusted to about 50 and 100 nm (see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011. doi:10.1155/2011/469679 for review). A liposome formulation may be mainly comprised of natural phospholipids and lipids such as 1,2-distearoyl-sn-glycero-3-phosphatidyl choline (DSPC), sphingomyelin, egg phosphatidylcholines and monosialoganglioside. Since this formulation is made up of phospholipids only, liposomal formulations have encountered many challenges, one of them being instability in plasma. Several attempts to overcome these challenges have been made, specifically in the manipulation of the lipid membrane. One of these attempts focused on the manipulation of cholesterol. Addition of cholesterol to conventional formulations reduces rapid release of the encapsulated bioactive compound into the plasma, and addition of 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE) increases the stability (see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011. doi:10.1155/2011/469679 for review). In a particularly advantageous embodiment, Trojan Horse liposomes (also known as Molecular Trojan Horses) are desirable and protocols may be found at cshprotocols.cshlp.org/content/2010/4/pdb.prot5407.long. These particles allow delivery of a transgene to the entire brain after an intravascular injection. 
Without being bound by limitation, it is believed that neutral lipid particles with specific antibodies conjugated to the surface allow crossing of the blood brain barrier via endocytosis. Applicant postulates utilizing Trojan Horse liposomes to deliver the CRISPR-Cas complexes to the brain via an intravascular injection, which would allow whole brain transgenic animals without the need for embryonic manipulation. About 1-5 g of DNA or RNA may be contemplated for in vivo administration in liposomes.

In another embodiment, the nucleic acid-targeting system or components thereof may be administered in liposomes, such as a stable nucleic-acid-lipid particle (SNALP) (see, e.g., Morrissey et al., Nature Biotechnology, Vol. 23, No. 8, August 2005). Daily intravenous injections of about 1, 3 or 5 mg/kg/day of a specific nucleic acid-targeting system targeted in a SNALP are contemplated. The daily treatment may be over about three days and then weekly for about five weeks. In another embodiment, a specific nucleic acid-targeting system encapsulated in a SNALP and administered by intravenous injection at doses of about 1 or 2.5 mg/kg is also contemplated (see, e.g., Zimmerman et al., Nature Letters, Vol. 441, 4 May 2006). The SNALP formulation may contain the lipids 3-N-[(ω-methoxypoly(ethylene glycol)2000)carbamoyl]-1,2-dimyristyloxy-propylamine (PEG-C-DMA), 1,2-dilinoleyloxy-N,N-dimethyl-3-aminopropane (DLinDMA), 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC) and cholesterol, in a 2:40:10:48 molar percent ratio (see, e.g., Zimmerman et al., Nature Letters, Vol. 441, 4 May 2006). In another embodiment, stable nucleic-acid-lipid particles (SNALPs) have proven to be effective delivery molecules to highly vascularized HepG2-derived liver tumors but not to poorly vascularized HCT-116 derived liver tumors (see, e.g., Li, Gene Therapy (2012) 19, 775-780). The SNALP liposomes may be prepared by formulating D-Lin-DMA and PEG-C-DMA with distearoylphosphatidylcholine (DSPC), Cholesterol and siRNA using a 25:1 lipid/siRNA ratio and a 48/40/10/2 molar ratio of Cholesterol/D-Lin-DMA/DSPC/PEG-C-DMA. The resulting SNALP liposomes are about 80-100 nm in size. 
In yet another embodiment, a SNALP may comprise synthetic cholesterol (Sigma-Aldrich, St Louis, Mo., USA), dipalmitoylphosphatidylcholine (Avanti Polar Lipids, Alabaster, Ala., USA), 3-N-[(ω-methoxy poly(ethylene glycol)2000)carbamoyl]-1,2-dimyristyloxypropylamine, and cationic 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane (see, e.g., Geisbert et al., Lancet 2010; 375: 1896-905). A dosage of about 2 mg/kg total nucleic acid-targeting system per dose administered as, for example, a bolus intravenous infusion may be contemplated. In yet another embodiment, a SNALP may comprise synthetic cholesterol (Sigma-Aldrich), 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC; Avanti Polar Lipids Inc.), PEG-cDMA, and 1,2-dilinoleyloxy-3-(N,N-dimethyl)aminopropane (DLinDMA) (see, e.g., Judge, J. Clin. Invest. 119:661-673 (2009)). Formulations used for in vivo studies may comprise a final lipid/RNA mass ratio of about 9:1.
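As a purely illustrative aid, the following minimal Python sketch shows how the molar ratios and the lipid/RNA mass ratio recited above may be handled arithmetically. It assumes that "DLinDMA" and "D-Lin-DMA" name the same lipid; the function names and the example RNA mass are hypothetical and are not taken from the cited references.

```python
# Illustrative arithmetic only; function names and example inputs are hypothetical.

def mole_fractions(ratio):
    """Convert a molar ratio (lipid name -> parts) into mole fractions."""
    total = sum(ratio.values())
    return {name: parts / total for name, parts in ratio.items()}

# 2:40:10:48 molar percent ratio (PEG-C-DMA : DLinDMA : DSPC : cholesterol)
ratio_a = {"PEG-C-DMA": 2, "DLinDMA": 40, "DSPC": 10, "cholesterol": 48}

# 48/40/10/2 molar ratio (Cholesterol / D-Lin-DMA / DSPC / PEG-C-DMA),
# written with the same keys on the assumption that D-Lin-DMA is DLinDMA.
ratio_b = {"cholesterol": 48, "DLinDMA": 40, "DSPC": 10, "PEG-C-DMA": 2}

def lipid_mass_for_rna(rna_mass_mg, lipid_to_rna=9.0):
    """Total lipid mass (mg) for a given RNA mass at a lipid/RNA mass ratio."""
    return rna_mass_mg * lipid_to_rna

if __name__ == "__main__":
    print(mole_fractions(ratio_a))
    print(mole_fractions(ratio_a) == mole_fractions(ratio_b))
    print(lipid_mass_for_rna(2.0))  # hypothetical 2 mg of RNA at about 9:1
```

Under that naming assumption, the two recited ratios describe the same lipid composition listed in a different order.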

The safety profile of RNAi nanomedicines has been reviewed by Barros and Gollob of Alnylam Pharmaceuticals (see, e.g., Advanced Drug Delivery Reviews 64 (2012) 1730-1737). The stable nucleic acid lipid particle (SNALP) is comprised of four different lipids: an ionizable lipid (DLinDMA) that is cationic at low pH, a neutral helper lipid, cholesterol, and a diffusible polyethylene glycol (PEG)-lipid. The particle is approximately 80 nm in diameter and is charge-neutral at physiologic pH. During formulation, the ionizable lipid serves to condense lipid with the anionic RNA. When positively charged under increasingly acidic endosomal conditions, the ionizable lipid also mediates the fusion of SNALP with the endosomal membrane, enabling release of RNA into the cytoplasm. The PEG-lipid stabilizes the particle and reduces aggregation during formulation, and subsequently provides a neutral hydrophilic exterior that improves pharmacokinetic properties. To date, two clinical programs have been initiated using SNALP formulations with RNA. Tekmira Pharmaceuticals recently completed a phase I single-dose study of SNALP-ApoB in adult volunteers with elevated LDL cholesterol. ApoB is predominantly expressed in the liver and jejunum and is essential for the assembly and secretion of VLDL and LDL. Seventeen subjects received a single dose of SNALP-ApoB (dose escalation across 7 dose levels). There was no evidence of liver toxicity (anticipated as the potential dose-limiting toxicity based on preclinical studies). One (of two) subjects at the highest dose experienced flu-like symptoms consistent with immune system stimulation, and the decision was made to conclude the trial. Alnylam Pharmaceuticals has similarly advanced ALN-TTR01, which employs the SNALP technology described above and targets hepatocyte production of both mutant and wild-type TTR to treat TTR amyloidosis (ATTR). 
Three ATTR syndromes have been described: familial amyloidotic polyneuropathy (FAP) and familial amyloidotic cardiomyopathy (FAC), both caused by autosomal dominant mutations in TTR; and senile systemic amyloidosis (SSA), caused by wildtype TTR. A placebo-controlled, single dose-escalation phase I trial of ALN-TTR01 was recently completed in patients with ATTR. ALN-TTR01 was administered as a 15-minute IV infusion to 31 patients (23 with study drug and 8 with placebo) within a dose range of 0.01 to 1.0 mg/kg (based on siRNA). Treatment was well tolerated with no significant increases in liver function tests. Infusion-related reactions were noted in 3 of 23 patients at >0.4 mg/kg; all responded to slowing of the infusion rate and all continued on study. Minimal and transient elevations of serum cytokines IL-6, IP-10 and IL-1ra were noted in two patients at the highest dose of 1 mg/kg (as anticipated from preclinical and NHP studies). Lowering of serum TTR, the expected pharmacodynamic effect of ALN-TTR01, was observed at 1 mg/kg.

In yet another embodiment, a SNALP may be made by solubilizing a cationic lipid, DSPC, cholesterol and PEG-lipid, e.g., in ethanol, e.g., at a molar ratio of 40:10:40:10, respectively (see, Semple et al., Nature Biotechnology, Volume 28 Number 2 Feb. 2010, pp. 172-177). The lipid mixture was added to an aqueous buffer (50 mM citrate, pH 4) with mixing to a final ethanol and lipid concentration of 30% (vol/vol) and 6.1 mg/ml, respectively, and allowed to equilibrate at 22° C. for 2 min before extrusion. The hydrated lipids were extruded through two stacked 80 nm pore-sized filters (Nuclepore) at 22° C. using a Lipex Extruder (Northern Lipids) until a vesicle diameter of 70-90 nm, as determined by dynamic light scattering analysis, was obtained. This generally required 1-3 passes. The siRNA (solubilized in a 50 mM citrate, pH 4 aqueous solution containing 30% ethanol) was added to the pre-equilibrated (35° C.) vesicles at a rate of ~5 ml/min with mixing. After a final target siRNA/lipid ratio of 0.06 (wt/wt) was reached, the mixture was incubated for a further 30 min at 35° C. to allow vesicle reorganization and encapsulation of the siRNA. The ethanol was then removed and the external buffer replaced with PBS (155 mM NaCl, 3 mM Na2HPO4, 1 mM KH2PO4, pH 7.5) by either dialysis or tangential flow diafiltration. siRNA were encapsulated in SNALP using a controlled step-wise dilution method. The lipid constituents of KC2-SNALP were DLin-KC2-DMA (cationic lipid), dipalmitoylphosphatidylcholine (DPPC; Avanti Polar Lipids), synthetic cholesterol (Sigma) and PEG-C-DMA used at a molar ratio of 57.1:7.1:34.3:1.4. Upon formation of the loaded particles, SNALP were dialyzed against PBS and filter sterilized through a 0.2 μm filter before use. Mean particle sizes were 75-85 nm and 90-95% of the siRNA was encapsulated within the lipid particles. 
The final siRNA/lipid ratio in formulations used for in vivo testing was ~0.15 (wt/wt). LNP-siRNA systems containing Factor VII siRNA were diluted to the appropriate concentrations in sterile PBS immediately before use and the formulations were administered intravenously through the lateral tail vein in a total volume of 10 ml/kg. This method and these delivery systems may be extrapolated to the nucleic acid-targeting system of the present invention.
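By way of a hedged illustration, the dilution and dosing quantities described above reduce to simple proportional arithmetic. The following Python sketch is not part of the cited method; the function names, the example body weight and the example dose are hypothetical assumptions chosen only to show the calculation.

```python
# Illustrative dosing arithmetic; names and example values are hypothetical.

def injection_volume_ml(body_weight_kg, volume_ml_per_kg=10.0):
    """Injected volume for a total administration volume given in ml/kg."""
    return body_weight_kg * volume_ml_per_kg

def required_concentration_mg_per_ml(dose_mg_per_kg, volume_ml_per_kg=10.0):
    """Concentration to which a formulation is diluted so that the fixed
    ml/kg volume delivers the desired mg/kg dose."""
    return dose_mg_per_kg / volume_ml_per_kg

def lipid_mass_mg(sirna_mass_mg, sirna_to_lipid_wt=0.06):
    """Lipid mass corresponding to an siRNA mass at a given siRNA/lipid
    weight ratio (0.06 wt/wt is the target ratio recited above)."""
    return sirna_mass_mg / sirna_to_lipid_wt

if __name__ == "__main__":
    # Hypothetical 25 g animal receiving a hypothetical 1 mg/kg dose.
    print(injection_volume_ml(0.025))
    print(required_concentration_mg_per_ml(1.0))
    print(lipid_mass_mg(0.6))
```

The same functions may be evaluated with the final in vivo ratio of about 0.15 (wt/wt) by passing `sirna_to_lipid_wt=0.15`.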

Other Lipids

Other cationic lipids, such as amino lipid 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane (DLin-KC2-DMA), may be utilized to encapsulate the nucleic acid-targeting system or components thereof or nucleic acid molecule(s) coding therefor, e.g., similar to siRNA (see, e.g., Jayaraman, Angew. Chem. Int. Ed. 2012, 51, 8529-8533), and hence may be employed in the practice of the invention. A preformed vesicle with the following lipid composition may be contemplated: amino lipid, distearoylphosphatidylcholine (DSPC), cholesterol and (R)-2,3-bis(octadecyloxy) propyl-1-(methoxy poly(ethylene glycol)2000)propylcarbamate (PEG-lipid) in the molar ratio 40/10/40/10, respectively, and a FVII siRNA/total lipid ratio of approximately 0.05 (w/w). To ensure a narrow particle size distribution in the range of 70-90 nm and a low polydispersity index of 0.11±0.04 (n=56), the particles may be extruded up to three times through 80 nm membranes prior to adding the guide RNA. Particles containing the highly potent amino lipid 16 may be used, in which the molar ratio of the four lipid components 16, DSPC, cholesterol and PEG-lipid is 50/10/38.5/1.5, which may be further optimized to enhance in vivo activity.

Michael S D Kormann et al. (“Expression of therapeutic proteins after delivery of chemically modified mRNA in mice”, Nature Biotechnology, Volume 29, Pages 154-157 (2011)) describe the use of lipid envelopes to deliver RNA. Use of lipid envelopes is also preferred in the present invention.

In another embodiment, lipids may be formulated with the RNA-targeting system (CRISPR-Cas13 complex, i.e., the Cas13 complexed with crRNA) of the present invention or component(s) thereof or nucleic acid molecule(s) coding therefor to form lipid nanoparticles (LNPs). Lipids, including but not limited to DLin-KC2-DMA, C12-200 and the colipids distearoylphosphatidylcholine, cholesterol, and PEG-DMG, may be formulated with the RNA-targeting system instead of siRNA (see, e.g., Novobrantseva, Molecular Therapy-Nucleic Acids (2012) 1, e4; doi:10.1038/mtna.2011.3) using a spontaneous vesicle formation procedure. The component molar ratio may be about 50/10/38.5/1.5 (DLin-KC2-DMA or C12-200/distearoylphosphatidylcholine/cholesterol/PEG-DMG). The final lipid:siRNA weight ratio may be ~12:1 and ~9:1 in the case of DLin-KC2-DMA and C12-200 lipid particles (LNPs), respectively. The formulations may have mean particle diameters of ~80 nm with >90% entrapment efficiency. A 3 mg/kg dose may be contemplated. Tekmira has a portfolio of approximately 95 patent families, in the U.S. and abroad, that are directed to various aspects of LNPs and LNP formulations (see, e.g., U.S. Pat. Nos. 7,982,027; 7,799,565; 8,058,069; 8,283,333; 7,901,708; 7,745,651; 7,803,397; 8,101,741; 8,188,263; 7,915,399; 8,236,943 and 7,838,658 and European Pat. Nos 1766035; 1519714; 1781593 and 1664316), all of which may be used and/or adapted to the present invention.

The RNA-targeting system or components thereof or nucleic acid molecule(s) coding therefor may be delivered encapsulated in PLGA Microspheres such as that further described in US published applications 20130252281 and 20130245107 and 20130244279 (assigned to Moderna Therapeutics) which relate to aspects of formulation of compositions comprising modified nucleic acid molecules which may encode a protein, a protein precursor, or a partially or fully processed form of the protein or a protein precursor. The formulation may have a molar ratio 50:10:38.5:1.5-3.0 (cationic lipid:fusogenic lipid:cholesterol:PEG lipid). The PEG lipid may be selected from, but is not limited to, PEG-c-DOMG and PEG-DMG. The fusogenic lipid may be DSPC. See also, Schrum et al., Delivery and Formulation of Engineered Nucleic Acids, US published application 20120251618.

Nanomerics' technology addresses bioavailability challenges for a broad range of therapeutics, including low molecular weight hydrophobic drugs, peptides, and nucleic acid based therapeutics (plasmid, siRNA, miRNA). Specific administration routes for which the technology has demonstrated clear advantages include the oral route, transport across the blood-brain-barrier, delivery to solid tumours, as well as to the eye. See, e.g., Mazza et al., 2013, ACS Nano. 2013 Feb. 26; 7(2):1016-26; Uchegbu and Siew, 2013, J Pharm Sci. 102(2):305-10 and Lalatsa et al., 2012, J Control Release. 2012 Jul. 20; 161(2):523-36.

US Patent Publication No. 20050019923 describes cationic dendrimers for delivering bioactive molecules, such as polynucleotide molecules, peptides and polypeptides and/or pharmaceutical agents, to a mammalian body. The dendrimers are suitable for targeting the delivery of the bioactive molecules to, for example, the liver, spleen, lung, kidney or heart (or even the brain). Dendrimers are synthetic 3-dimensional macromolecules that are prepared in a step-wise fashion from simple branched monomer units, the nature and functionality of which can be easily controlled and varied. Dendrimers are synthesized from the repeated addition of building blocks to a multifunctional core (divergent approach to synthesis), or towards a multifunctional core (convergent approach to synthesis) and each addition of a 3-dimensional shell of building blocks leads to the formation of a higher generation of the dendrimers. Polypropylenimine dendrimers start from a diaminobutane core to which is added twice the number of amino groups by a double Michael addition of acrylonitrile to the primary amines followed by the hydrogenation of the nitriles. This results in a doubling of the amino groups. Polypropylenimine dendrimers contain 100% protonable nitrogens and up to 64 terminal amino groups (generation 5, DAB 64). Protonable groups are usually amine groups which are able to accept protons at neutral pH. The use of dendrimers as gene delivery agents has largely focused on the use of the polyamidoamine and phosphorous-containing compounds with a mixture of amine/amide or N-P(O₂)S as the conjugating units respectively, with no work being reported on the use of the lower generation polypropylenimine dendrimers for gene delivery. Polypropylenimine dendrimers have also been studied as pH sensitive controlled release systems for drug delivery and for their encapsulation of guest molecules when chemically modified by peripheral amino acid groups. 
The cytotoxicity and interaction of polypropylenimine dendrimers with DNA as well as the transfection efficacy of DAB 64 have also been studied. US Patent Publication No. 20050019923 is based upon the observation that, contrary to earlier reports, cationic dendrimers, such as polypropylenimine dendrimers, display suitable properties, such as specific targeting and low toxicity, for use in the targeted delivery of bioactive molecules, such as genetic material. In addition, derivatives of the cationic dendrimer also display suitable properties for the targeted delivery of bioactive molecules. See also, Bioactive Polymers, US published application 20080267903, which discloses “Various polymers, including cationic polyamine polymers and dendrimeric polymers, are shown to possess anti-proliferative activity, and may therefore be useful for treatment of disorders characterised by undesirable cellular proliferation such as neoplasms and tumours, inflammatory disorders (including autoimmune disorders), psoriasis and atherosclerosis. The polymers may be used alone as active agents, or as delivery vehicles for other therapeutic agents, such as drug molecules or nucleic acids for gene therapy. In such cases, the polymers' own intrinsic anti-tumour activity may complement the activity of the agent to be delivered.” The disclosures of these patent publications may be employed in conjunction with herein teachings for delivery of nucleic acid-targeting system(s) or component(s) thereof or nucleic acid molecule(s) coding therefor.
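The doubling of amino groups per generation described above for polypropylenimine dendrimers can be illustrated with a short Python sketch. It assumes, as a labeled simplification, that the diaminobutane core contributes two primary amines and is counted as generation 0, so that generation 5 corresponds to the 64 terminal amino groups of DAB 64; the function name is hypothetical.

```python
# Illustrative count of terminal amino groups; assumes the diaminobutane core
# (two primary amines) is generation 0 and each generation doubles the count.

def terminal_amino_groups(generation, core_amines=2):
    """Terminal amino groups after `generation` rounds of double Michael
    addition followed by hydrogenation, each round doubling the amines."""
    if generation < 0:
        raise ValueError("generation must be non-negative")
    return core_amines * 2 ** generation

if __name__ == "__main__":
    for g in range(6):
        print(g, terminal_amino_groups(g))
```

Under this numbering assumption, generation 5 evaluates to 64 terminal amino groups, matching the DAB 64 designation.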

Supercharged Proteins

Supercharged proteins are a class of engineered or naturally occurring proteins with unusually high positive or negative net theoretical charge and may be employed in delivery of nucleic acid-targeting system(s) or component(s) thereof or nucleic acid molecule(s) coding therefor. Both supernegatively and superpositively charged proteins exhibit a remarkable ability to withstand thermally or chemically induced aggregation. Superpositively charged proteins are also able to penetrate mammalian cells. Associating cargo with these proteins, such as plasmid DNA, RNA, or other proteins, can enable the functional delivery of these macromolecules into mammalian cells both in vitro and in vivo. David Liu's lab reported the creation and characterization of supercharged proteins in 2007 (Lawrence et al., 2007, Journal of the American Chemical Society 129, 10110-10112).

The nonviral delivery of RNA and plasmid DNA into mammalian cells is valuable both for research and therapeutic applications (Akinc et al., 2010, Nat. Biotech. 26, 561-569). Purified +36 GFP protein (or other superpositively charged protein) is mixed with RNAs in the appropriate serum-free media and allowed to complex prior to addition to cells. Inclusion of serum at this stage inhibits formation of the supercharged protein-RNA complexes and reduces the effectiveness of the treatment. The following protocol has been found to be effective for a variety of cell lines (McNaughton et al., 2009, Proc. Natl. Acad. Sci. USA 106, 6111-6116). However, pilot experiments varying the dose of protein and RNA should be performed to optimize the procedure for specific cell lines. (1) One day before treatment, plate 1×10⁵ cells per well in a 48-well plate. (2) On the day of treatment, dilute purified +36 GFP protein in serum-free media to a final concentration of 200 nM. Add RNA to a final concentration of 50 nM. Vortex to mix and incubate at room temperature for 10 min. (3) During incubation, aspirate media from cells and wash once with PBS. (4) Following incubation of +36 GFP and RNA, add the protein-RNA complexes to cells. (5) Incubate cells with complexes at 37° C. for 4 h. (6) Following incubation, aspirate the media and wash three times with 20 U/mL heparin PBS. Incubate cells with serum-containing media for a further 48 h or longer depending upon the assay for activity. (7) Analyze cells by immunoblot, qPCR, phenotypic assay, or other appropriate method.
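As an illustrative aid to step (2) of the protocol above, the dilutions to 200 nM protein and 50 nM RNA follow the usual C1V1 = C2V2 relationship. In the following Python sketch, the stock concentrations and the per-well volume are hypothetical assumptions and are not values from the cited protocol; only the 200 nM and 50 nM final concentrations come from the text.

```python
# Illustrative C1*V1 = C2*V2 dilution arithmetic; stock concentrations and
# the final volume are hypothetical assumptions.

def stock_volume_ul(final_nM, final_volume_ul, stock_nM):
    """Volume of stock needed to reach `final_nM` in `final_volume_ul`."""
    if stock_nM < final_nM:
        raise ValueError("stock must be at least as concentrated as the target")
    return final_nM * final_volume_ul / stock_nM

def complex_mix(final_volume_ul, protein_stock_nM, rna_stock_nM,
                protein_final_nM=200.0, rna_final_nM=50.0):
    """Volumes (ul) of protein stock, RNA stock and serum-free media."""
    protein_ul = stock_volume_ul(protein_final_nM, final_volume_ul, protein_stock_nM)
    rna_ul = stock_volume_ul(rna_final_nM, final_volume_ul, rna_stock_nM)
    media_ul = final_volume_ul - protein_ul - rna_ul
    return {"protein_ul": protein_ul, "rna_ul": rna_ul, "media_ul": media_ul}

if __name__ == "__main__":
    # Hypothetical: 250 ul per well, 10 uM protein stock, 5 uM RNA stock.
    print(complex_mix(250.0, protein_stock_nM=10_000.0, rna_stock_nM=5_000.0))
```

At the recited final concentrations the protein-to-RNA molar ratio is 4:1 regardless of the volume chosen, which is one parameter that the pilot experiments mentioned above may vary.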

+36 GFP was found to be an effective plasmid delivery reagent in a range of cells. See also, e.g., McNaughton et al., Proc. Natl. Acad. Sci. USA 106, 6111-6116 (2009); Cronican et al., ACS Chemical Biology 5, 747-752 (2010); Cronican et al., Chemistry & Biology 18, 833-838 (2011); Thompson et al., Methods in Enzymology 503, 293-319 (2012); Thompson, D. B., et al., Chemistry & Biology 19 (7), 831-843 (2012). The methods of the supercharged proteins may be used and/or adapted for delivery of the RNA-targeting system(s) or component(s) thereof or nucleic acid molecule(s) coding therefor of the invention.

Cell Penetrating Peptides (CPPs)

In yet another embodiment, cell penetrating peptides (CPPs) are contemplated for the delivery of the CRISPR Cas system. CPPs are short peptides that facilitate cellular uptake of various molecular cargo (from nanosize particles to small chemical molecules and large fragments of DNA). The term “cargo” as used herein includes but is not limited to the group consisting of therapeutic agents, diagnostic probes, peptides, nucleic acids, antisense oligonucleotides, plasmids, proteins, particles including nanoparticles, liposomes, chromophores, small molecules and radioactive materials. In aspects of the invention, the cargo may also comprise any component of the CRISPR Cas system or the entire functional CRISPR Cas system. Aspects of the present invention further provide methods for delivering a desired cargo into a subject comprising: (a) preparing a complex comprising the cell penetrating peptide of the present invention and a desired cargo, and (b) orally, intraarticularly, intraperitoneally, intrathecally, intraarterially, intranasally, intraparenchymally, subcutaneously, intramuscularly, intravenously, dermally, intrarectally, or topically administering the complex to a subject. The cargo is associated with the peptides either through chemical linkage via covalent bonds or through non-covalent interactions. The function of the CPPs is to deliver the cargo into cells, a process that commonly occurs through endocytosis with the cargo delivered to the endosomes of living mammalian cells. Cell-penetrating peptides are of different sizes, amino acid sequences, and charges but all CPPs have one distinct characteristic, which is the ability to translocate the plasma membrane and facilitate the delivery of various molecular cargoes to the cytoplasm or an organelle. CPP translocation may be classified into three main entry mechanisms: direct penetration in the membrane, endocytosis-mediated entry, and translocation through the formation of a transitory structure. 
CPPs have found numerous applications in medicine as drug delivery agents in the treatment of different diseases including cancer and virus inhibitors, as well as contrast agents for cell labeling. Examples of the latter include acting as a carrier for GFP, MRI contrast agents, or quantum dots. CPPs hold great potential as in vitro and in vivo delivery vectors for use in research and medicine. CPPs typically have an amino acid composition that either contains a high relative abundance of positively charged amino acids such as lysine or arginine or has sequences that contain an alternating pattern of polar/charged amino acids and non-polar, hydrophobic amino acids. These two types of structures are referred to as polycationic or amphipathic, respectively. A third class of CPPs is the hydrophobic peptides, which contain only apolar residues with low net charge or have hydrophobic amino acid groups that are crucial for cellular uptake. One of the initial CPPs discovered was the trans-activating transcriptional activator (Tat) from Human Immunodeficiency Virus 1 (HIV-1), which was found to be efficiently taken up from the surrounding media by numerous cell types in culture. Since then, the number of known CPPs has expanded considerably and small molecule synthetic analogues with more effective protein transduction properties have been generated. CPPs include but are not limited to Penetratin, Tat (48-60), Transportan, and (R-Ahx-R4) (Ahx=aminohexanoyl).

U.S. Pat. No. 8,372,951 provides a CPP derived from eosinophil cationic protein (ECP) which exhibits high cell-penetrating efficiency and low toxicity. Aspects of delivering the CPP with its cargo into a vertebrate subject are also provided. Further aspects of CPPs and their delivery are described in U.S. Pat. Nos. 8,575,305; 8,614,194 and 8,044,019. CPPs can be used to deliver the CRISPR-Cas system or components thereof. That CPPs can be employed to deliver the CRISPR-Cas system or components thereof is also provided in the manuscript “Gene disruption by cell-penetrating peptide-mediated delivery of Cas9 protein and guide RNA”, by Suresh Ramakrishna, Abu-Bonsrah Kwaku Dad, Jagadish Beloor, et al. Genome Res. 2014 Apr. 2. [Epub ahead of print], incorporated by reference in its entirety, wherein it is demonstrated that treatment with CPP-conjugated recombinant Cas9 protein and CPP-complexed guide RNAs leads to endogenous gene disruptions in human cell lines. In the paper the Cas9 protein was conjugated to CPP via a thioether bond, whereas the guide RNA was complexed with CPP, forming condensed, positively charged particles. It was shown that simultaneous and sequential treatment of human cells, including embryonic stem cells, dermal fibroblasts, HEK293T cells, HeLa cells, and embryonic carcinoma cells, with the modified Cas9 and guide RNA led to efficient gene disruptions with reduced off-target mutations relative to plasmid transfections. CPP delivery can be used in the practice of the invention.

Implantable Devices

In another embodiment, implantable devices are also contemplated for delivery of the RNA-targeting system or component(s) thereof or nucleic acid molecule(s) coding therefor. For example, US Patent Publication 20110195123 discloses an implantable medical device which elutes a drug locally and over a prolonged period, including several types of such a device, the treatment modes of implementation and methods of implantation. The device comprises a polymeric substrate, such as a matrix for example, that is used as the device body, and drugs, and in some cases additional scaffolding materials, such as metals or additional polymers, and materials to enhance visibility and imaging. An implantable delivery device can be advantageous in providing release locally and over a prolonged period, where drug is released directly to the extracellular matrix (ECM) of the diseased area such as tumor, inflammation, degeneration or for symptomatic objectives, or to injured smooth muscle cells, or for prevention. One kind of drug is RNA, as disclosed above, and this system may be used and/or adapted to the nucleic acid-targeting system of the present invention. The modes of implantation in some embodiments are existing implantation procedures that are developed and used today for other treatments, including brachytherapy and needle biopsy. In such cases the dimensions of the new implant described in this invention are similar to the original implant. Typically a few devices are implanted during the same treatment procedure. US Patent Publication 20110195123 provides a drug delivery implantable or insertable system, including systems applicable to a cavity such as the abdominal cavity and/or any other type of administration in which the drug delivery system is not anchored or attached, comprising a biostable and/or degradable and/or bioabsorbable polymeric substrate, which may for example optionally be a matrix. 
It should be noted that the term “insertion” also includes implantation. The drug delivery system is preferably implemented as a “Loder” as described in US Patent Publication 20110195123. The polymer or plurality of polymers are biocompatible, incorporating an agent and/or plurality of agents, enabling the release of agent at a controlled rate, wherein the total volume of the polymeric substrate, such as a matrix for example, in some embodiments is optionally and preferably no greater than a maximum volume that permits a therapeutic level of the agent to be reached. As a non-limiting example, such a volume is preferably within the range of 0.1 mm³ to 1000 mm³, as required by the volume for the agent load. The Loder may optionally be larger, for example when incorporated with a device whose size is determined by functionality, for example and without limitation, a knee joint, an intra-uterine or cervical ring and the like. The drug delivery system (for delivering the composition) is designed in some embodiments to preferably employ degradable polymers, wherein the main release mechanism is bulk erosion; or in some embodiments, non-degradable, or slowly degraded polymers are used, wherein the main release mechanism is diffusion rather than bulk erosion, so that the outer part functions as a membrane, and its internal part functions as a drug reservoir, which practically is not affected by the surroundings for an extended period (for example from about a week to about a few months). Combinations of different polymers with different release mechanisms may also optionally be used. The concentration gradient at the surface is preferably maintained effectively constant during a significant period of the total drug releasing period, and therefore the diffusion rate is effectively constant (termed “zero mode” diffusion). 
By the term “constant” it is meant a diffusion rate that is preferably maintained above the lower threshold of therapeutic effectiveness, but which may still optionally feature an initial burst and/or may fluctuate, for example increasing and decreasing to a certain degree. The diffusion rate is preferably so maintained for a prolonged period, and it can be considered constant to a certain level to optimize the therapeutically effective period, for example the effective silencing period. The drug delivery system optionally and preferably is designed to shield the nucleotide based therapeutic agent from degradation, whether chemical in nature or due to attack from enzymes and other factors in the body of the subject. The drug delivery system of US Patent Publication 20110195123 is optionally associated with sensing and/or activation appliances that are operated at and/or after implantation of the device, by non and/or minimally invasive methods of activation and/or acceleration/deceleration, for example optionally including but not limited to thermal heating and cooling, laser beams, and ultrasonic, including focused ultrasound and/or RF (radiofrequency) methods or devices. According to some embodiments of US Patent Publication 20110195123, the site for local delivery may optionally include target sites characterized by high abnormal proliferation of cells, and suppressed apoptosis, including tumors, active and/or chronic inflammation and infection including autoimmune disease states, degenerating tissue including muscle and nervous tissue, chronic pain, degenerative sites, and location of bone fractures and other wound locations for enhancement of regeneration of tissue, and injured cardiac, smooth and striated muscle. The site for implantation of the composition, or target site, preferably features a radius, area and/or volume that is sufficiently small for targeted local delivery. For example, the target site optionally has a diameter in a range of from about 0.1 mm to about 5 cm. 
The location of the target site is preferably selected for maximum therapeutic efficacy. For example, the composition of the drug delivery system (optionally with a device for implantation as described above) is optionally and preferably implanted within or in the proximity of a tumor environment, or the blood supply associated therewith. For example the composition (optionally with the device) is optionally implanted within or in the proximity to pancreas, prostate, breast, liver, via the nipple, within the vascular system and so forth. The target location is optionally selected from the group comprising, consisting essentially of, or consisting of (as non-limiting examples only, as optionally any site within the body may be suitable for implanting a Loder): 1. brain at degenerative sites like in Parkinson or Alzheimer disease at the basal ganglia, white and gray matter; 2. spine as in the case of amyotrophic lateral sclerosis (ALS); 3. uterine cervix to prevent HPV infection; 4. active and chronic inflammatory joints; 5. dermis as in the case of psoriasis; 6. sympathetic and sensoric nervous sites for analgesic effect; 7. Intraosseous implantation; 8. acute and chronic infection sites; 9. Intravaginal; 10. Inner ear-auditory system, labyrinth of the inner ear, vestibular system; 11. Intratracheal; 12. Intra-cardiac; coronary, epicardiac; 13. urinary bladder; 14. biliary system; 15. parenchymal tissue including and not limited to the kidney, liver, spleen; 16. lymph nodes; 17. salivary glands; 18. dental gums; 19. Intra-articular (into joints); 20. Intra-ocular; 21. Brain tissue; 22. Brain ventricles; 23. Cavities, including abdominal cavity (for example but without limitation, for ovary cancer); 24. Intraesophageal and 25. Intrarectal.

Optionally, insertion of the system (for example a device containing the composition) is associated with injection of material to the ECM at the target site and the vicinity of that site to affect local pH and/or temperature and/or other biological factors affecting the diffusion of the drug and/or drug kinetics in the ECM, of the target site and the vicinity of such a site. Optionally, according to some embodiments, the release of said agent could be associated with sensing and/or activation appliances that are operated prior and/or at and/or after insertion, by non and/or minimally invasive and/or else methods of activation and/or acceleration/deceleration, including laser beam, radiation, thermal heating and cooling, and ultrasonic, including focused ultrasound and/or RF (radiofrequency) methods or devices, and chemical activators.

According to embodiments of US Patent Publication 20110195123 that can be used in the practice of the invention, the drug preferably comprises an RNA, for example for localized cancer cases in breast, pancreas, brain, kidney, bladder, lung, and prostate as described below. Although exemplified with RNAi, many drugs are applicable to be encapsulated in Loder, and can be used in association with this invention, as long as such drugs can be encapsulated with the Loder substrate, such as a matrix for example, and this system may be used and/or adapted to deliver the nucleic acid-targeting system of the present invention. As another example of a specific application, neuro and muscular degenerative diseases develop due to abnormal gene expression. Local delivery of RNAs may have therapeutic properties for interfering with such abnormal gene expression. Local delivery of anti-apoptotic, anti-inflammatory and anti-degenerative drugs including small drugs and macromolecules may also optionally be therapeutic. In such cases the Loder is applied for prolonged release at constant rate and/or through a dedicated device that is implanted separately.

All of this may be used and/or adapted to the RNA-targeting system of the present invention. The implantable device technology discussed herein can be employed with the teachings herein; hence, by this disclosure and the knowledge in the art, a CRISPR-Cas13 system or complex, or components thereof, or nucleic acid molecules thereof or encoding or providing components, may be delivered via an implantable device.

Polymer-Based Particles

The systems and compositions herein may be delivered using polymer-based particles (e.g., nanoparticles). In some embodiments, the polymer-based particles may mimic a viral mechanism of membrane fusion. The polymer-based particles may be a synthetic copy of Influenza virus machinery and form transfection complexes with various types of nucleic acids (siRNA, miRNA, plasmid DNA or shRNA, mRNA) that cells take up via the endocytosis pathway, a process that involves the formation of an acidic compartment. The low pH in late endosomes acts as a chemical switch that renders the particle surface hydrophobic and facilitates membrane crossing. Once in the cytosol, the particle releases its payload for cellular action. This Active Endosome Escape technology is safe and maximizes transfection efficiency as it uses a natural uptake pathway. In some embodiments, the polymer-based particles may comprise alkylated and carboxyalkylated branched polyethylenimine. In some examples, the polymer-based particles are VIROMER, e.g., VIROMER RNAi, VIROMER RED, VIROMER mRNA, VIROMER CRISPR. Example methods of delivering the systems and compositions herein include those described in Bawage S S et al., Synthetic mRNA expressed Cas13a mitigates RNA virus infections, www.biorxiv.org/content/10.1101/370460v1.full, doi: doi.org/10.1101/370460; Viromer® RED, a powerful tool for transfection of keratinocytes, doi: 10.13140/RG.2.2.16993.61281; and Viromer® Transfection, Factbook 2018: technology, product overview, users' data, doi: 10.13140/RG.2.2.23912.16642.

Vectors

In certain aspects the invention involves vectors, e.g. for delivering or introducing in a cell CRISPR-Cas and/or RNA capable of guiding CRISPR-Cas to a target locus (i.e. guide RNA), but also for propagating these components (e.g. in prokaryotic cells). As used herein, a “vector” is a tool that allows or facilitates the transfer of an entity from one environment to another. It is a replicon, such as a plasmid, phage, or cosmid, into which another DNA segment may be inserted so as to bring about the replication of the inserted segment. Generally, a vector is capable of replication when associated with the proper control elements. In general, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, or no free ends (e.g. circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. One type of vector is a “plasmid,” which refers to a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques. Another type of vector is a viral vector, wherein virally-derived DNA or RNA sequences are present in the vector for packaging into a virus (e.g. retroviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, and adeno-associated viruses (AAVs)). Viral vectors also include polynucleotides carried by a virus for transfection into a host cell. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g. bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively-linked. Such vectors are referred to herein as “expression vectors.” Common expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.

Recombinant expression vectors can comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory elements, which may be selected on the basis of the host cells to be used for expression, that are operatively-linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, “operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory element(s) in a manner that allows for expression of the nucleotide sequence (e.g. in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). With regard to recombination and cloning methods, mention is made of U.S. patent application Ser. No. 10/815,730, published Sep. 2, 2004 as US 2004-0171156 A1, the contents of which are herein incorporated by reference in their entirety.

The vector(s) can include the regulatory element(s), e.g., promoter(s). The vector(s) can comprise CRISPR-Cas encoding sequence(s), and/or a single, but possibly also can comprise at least 2, 3 or 8 or 16 or 32 or 48 or 50 guide RNA(s) (e.g., crRNAs) encoding sequences, such as 1-2, 1-3, 1-4, 1-5, 3-6, 3-7, 3-8, 3-9, 3-10, 3-8, 3-16, 3-30, 3-32, 3-48, 3-50 RNA(s) (e.g., crRNAs). In a single vector there can be a promoter for each RNA (e.g., crRNA(s)), advantageously when there are up to about 16 RNA(s) (e.g., crRNA(s)); and, when a single vector provides for more than 16 RNA(s) (e.g., crRNA(s)), one or more promoter(s) can drive expression of more than one of the RNA(s) (e.g., crRNA(s)), e.g., when there are 32 RNA(s) (e.g., sgRNAs or crRNA(s)), each promoter can drive expression of two RNA(s) (e.g., sgRNAs or crRNA(s)), and when there are 48 RNA(s) (e.g., sgRNAs or crRNA(s)), each promoter can drive expression of three RNA(s) (e.g., sgRNAs or crRNA(s)). By simple arithmetic and well established cloning protocols and the teachings in this disclosure one skilled in the art can readily practice the invention as to the RNA(s), e.g., sgRNA(s) or crRNA(s) for a suitable exemplary vector such as AAV, and a suitable promoter such as the U6 promoter, e.g., U6-sgRNAs or -crRNA(s). For example, the packaging limit of AAV is ˜4.7 kb. The skilled person can readily fit about 12-16, e.g., 13 U6-sgRNA or crRNA(s) cassettes in a single vector. This can be assembled by any suitable means, such as a golden gate strategy used for TALE assembly (www.genome-engineering.org/taleffectors/). The skilled person can also use a tandem guide strategy to increase the number of U6-sgRNAs or -crRNA(s) by approximately 1.5 times, e.g., to increase from 12-16, e.g., 13 to approximately 18-24, e.g., about 19 U6-sgRNAs or -crRNA(s). Therefore, one skilled in the art can readily reach approximately 18-24, e.g., about 19 promoter-RNAs, e.g., U6-sgRNAs or -crRNA(s) in a single vector, e.g., an AAV vector. A further means for increasing the number of promoters and RNAs, e.g., sgRNA(s) or crRNA(s) in a vector is to use a single promoter (e.g., U6) to express an array of RNAs, e.g., sgRNAs or crRNA(s) separated by cleavable sequences. And an even further means for increasing the number of promoter-RNAs, e.g., sgRNAs or crRNA(s) in a vector, is to express an array of promoter-RNAs, e.g., sgRNAs or crRNA(s) separated by cleavable sequences in the intron of a coding sequence or gene; and, in this instance it is advantageous to use a polymerase II promoter, which can have increased expression and enable the transcription of long RNA in a tissue specific manner (see, e.g., nar.oxfordjournals.org/content/34/7/e53.short, www.nature.com/mt/journal/v16/n9/abs/mt2008144a.html). In an advantageous embodiment, AAV may package U6 tandem sgRNA targeting up to about 50 genes. Accordingly, from the knowledge in the art and the teachings in this disclosure the skilled person can readily make and use vector(s), e.g., a single vector, expressing multiple RNAs or guides or sgRNAs or crRNA(s) under the control of, or operatively or functionally linked to, one or more promoters, especially as to the numbers of RNAs or guides or sgRNAs or crRNA(s) discussed herein, without any undue experimentation.
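
The "simple arithmetic" referred to above can be illustrated with a minimal sketch. It only reproduces the figures stated in the preceding paragraph (one promoter per RNA up to about 16 RNAs, two RNAs per promoter at 32, three at 48, and an approximately 1.5-fold increase from a tandem guide strategy). The function names, the 16-promoter default, and the rounding down to a whole cassette are assumptions made for illustration and are not part of the disclosure.

```python
import math


def guides_per_promoter(n_guides: int, max_promoters: int = 16) -> int:
    """Guides each promoter must drive so that n_guides fit on a vector.

    Assumes (hypothetically) a cap of max_promoters promoters per vector,
    taken from the "up to about 16" figure in the text.
    """
    if n_guides < 1 or max_promoters < 1:
        raise ValueError("n_guides and max_promoters must be positive")
    return math.ceil(n_guides / max_promoters)


def tandem_capacity(single_cassettes: int, factor: float = 1.5) -> int:
    """Approximate guide count after applying a tandem guide strategy.

    The 1.5x factor is the approximate figure given in the text; the result
    is rounded down to a whole number of guides (an illustrative choice).
    """
    return math.floor(single_cassettes * factor)


if __name__ == "__main__":
    for n in (16, 32, 48):
        print(n, "guides ->", guides_per_promoter(n), "per promoter")
    for c in (12, 13, 16):
        print(c, "cassettes ->", tandem_capacity(c), "guides with tandem design")
```

Under these assumptions, 12, 13 and 16 single cassettes correspond to 18, 19 and 24 guides, consistent with the "approximately 18-24, e.g., about 19" range stated above.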

Kits

In one aspect, the invention provides kits containing any one or more of the elements disclosed in the above methods and compositions. In some embodiments, the kit comprises a vector system as taught herein or one or more of the components of the CRISPR/Cas system or complex as taught herein, such as crRNAs and/or CRISPR-Cas effector protein or CRISPR-Cas effector protein encoding mRNA, and instructions for using the kit. Elements may be provided individually or in combinations, and may be provided in any suitable container, such as a vial, a bottle, or a tube. In some embodiments, the kit includes instructions in one or more languages, for example in more than one language. The instructions may be specific to the applications and methods described herein. In some embodiments, a kit comprises one or more reagents for use in a process utilizing one or more of the elements described herein. Reagents may be provided in any suitable container. For example, a kit may provide one or more reaction or storage buffers. Reagents may be provided in a form that is usable in a particular assay, or in a form that requires addition of one or more other components before use (e.g., in concentrate or lyophilized form). A buffer can be any buffer, including but not limited to a sodium carbonate buffer, a sodium bicarbonate buffer, a borate buffer, a Tris buffer, a MOPS buffer, a HEPES buffer, and combinations thereof. In some embodiments, the buffer is alkaline. In some embodiments, the buffer has a pH from about 7 to about 10. In some embodiments, the kit comprises one or more oligonucleotides corresponding to a guide sequence for insertion into a vector so as to operably link the guide or crRNA sequence and a regulatory element. In some embodiments, the kit comprises a homologous recombination template polynucleotide. In some embodiments, the kit comprises one or more of the vectors and/or one or more of the polynucleotides described herein. The kit may advantageously provide all elements of the systems of the invention.

The present application also provides aspects and embodiments as set forth in the following numbered Statements:

1. An engineered CRISPR-Cas protein comprising one or more HEPN domainsand further comprising one or more modified amino acids, wherein theamino acids: a) interact with a guide RNA that forms a complex with theengineered CRISPR-Cas protein; b) are in a HEPN active site, aninter-domain linker domain, a lid domain, a helical domain 1, a helicaldomain 2, or a bridge helix domain of the engineered CRISPR-Cas protein;or c) a combination thereof.2. The engineered CRISPR-Cas protein of statement 1, wherein the HEPNdomain comprises a RxxxxH motif.3. The engineered CRISPR-Cas protein of statement 1 or 2, wherein theRxxxxH motif comprises a R{N/H/K}X₁X₂X₃H sequence.4. The engineered CRISPR-Cas protein of any one of preceding statements,wherein: X₁ is R, S, D, E, Q, N, G, or Y; X₂ is independently I, S, T,V, or L; and X₃ is independently L, F, N, Y, V, I, S, D, E, or A.5. The engineered CRISPR-Cas protein of any one of preceding statements,wherein the CRISPR-Cas protein is a Type VI CRISPR-Cas protein.6. The engineered CRISPR-Cas protein of any one of preceding statements,wherein the Type VI CRISPR-Cas protein is Cas13.7. The engineered CRISPR-Cas protein of any one of preceding statements,wherein the Type VI CRISPR-Cas protein is Cas13a, Cas13b, Cas13c, orCas13d.8. The engineered CRISPR-Cas protein of any one of preceding statements,comprising one or more mutation of an amino acid corresponding to thefollowing amino acids of Prevotella buccae Cas13b (PbCas13b): T405,H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741,K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600,K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836,R838, R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296,N297, Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398,E399, K294, E400, R56, N157, H161, H452, N455, K484, N486, G566, H567,A656, V795, A796, W842, K871, E873, R874, R1068, N1069, or H1073.9. 
The engineered CRISPR-Cas protein of any one of preceding statements,comprising one or more mutation of an amino acid corresponding to thefollowing amino acids of Prevotella buccae Cas13b (PbCas13b): H407,K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744,N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607,K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838,R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296, N297,Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398, E399,K294, E400, R56, N157, H161, H452, N455, K484, N486, G566, H567, W842,K871, E873, R874, R1068, N1069, or H1073.10. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of Prevotella buccae Cas13b (PbCas13b): T405,H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741,K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600,K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836,R838, R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296,N297, Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398,E399, K294, or E400.11. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of Prevotella buccae Cas13b (PbCas13b): K393,R402, N482, T405, H407, S658, N653, A656, K655, N652, H567, N455, H500,K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791,G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, R56,N157, H161, R1068, N1069, or H1073.12. 
The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: K393, R402, N482, H407, S658, N653,K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846,R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480,K457, K741, R56, N157, H161, R1068, N1069, or H1073.13. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: W842, K846, K870, E873, or R877.14. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 1 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 1 ofPbCas13b: W842, K846, K870, E873, or R877.15. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 1-3 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 1-3 ofPbCas13b: W842, K846, K870, E873, or R877.16. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in the bridge helix domain one or more mutation of an aminoacid corresponding to the following amino acids in the bridge helixdomain of PbCas13b: W842, K846, K870, E873, or R877.17. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: K393, R402, N480, N482, N652, orN653.18. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: K393, R402, N480, or N482.19. 
The engineered CRISPR-Cas protein of any one of preceding statementscomprising in the LID domain one or more mutation of an amino acidcorresponding to the following amino acids in the LID domain ofPbCas13b: K393, R402, N480, or N482.20. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: N652 or N653.21. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 2 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 2 ofPbCas13b: N652 or N653.22. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: T405, H407, S658, N653, A656, K655,N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874,R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484,N480, K457, or K741.23. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: H407, S658, N653, K655, N652, H567,N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791,G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741.24. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: S658, N653, A656, K655, N652, H567,H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796,R791, G566, K590, R638, S757, N756, or K741.25. 
The engineered CRISPR-Cas protein of any one of preceding statementscomprising in a helical domain one or more mutation of an amino acidcorresponding to the following amino acids in a helical domain ofPbCas13b: S658, N653, A656, K655, N652, H567, H500, K871, K857, K870,W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638,S757, N756, or K741.26. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: H567, H500, K871, K857, K870, W842,E873, R877, K846, R874, R762, V795, A796, R791, G566, S757, or N756.27. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 1 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 1 ofPbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874,R762, V795, A796, R791, G566, S757, or N756.28. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: H567, H500, R762, V795, A796, R791,G566, S757, or N756.29. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 1 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 1 ofPbCas13b: H567, H500, R762, V795, A796, R791, G566, S757, or N756.30. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: K871, K857, K870, W842, E873, R877,K846, or R874.31. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in the bridge helix domain one or more mutation of an aminoacid corresponding to the following amino acids in the bridge helixdomain of PbCas13b: K871, K857, K870, W842, E873, R877, K846, or R874.32. 
The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: H567, H500, or G566.33. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 1-2 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 1-2 ofPbCas13b: H567, H500, or G566.34. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: K871, K857, K870, W842, E873, R877,K846, R874, R762, V795, A796, R791, S757, or N756.35. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 1-3 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 1-3 ofPbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, V795,A796, R791, S757, or N756.36. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: R762, V795, A796, R791, S757, orN756.37. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 1-3 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 1-3 ofPbCas13b: R762, V795, A796, R791, S757, or N756.38. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: S658, N653, A656, K655, N652, K590,R638, or K741.39. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 2 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 2 ofPbCas13b: S658, N653, A656, K655, N652, K590, R638, or K741.40. 
The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: T405, H407, N486, K484, N480, H452,N455, or K457.41. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in the LID domain one or more mutation of an amino acidcorresponding to the following amino acids in the LID domain ofPbCas13b: T405, H407, N486, K484, N480, H452, N455, or K457.42. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: S658, N653, K655, N652, H567, H500,K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590,R638, S757, N756, or K741.43. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in a helical domain one or more mutation of an amino acidcorresponding to the following amino acids in a helical domain ofPbCas13b: S658, N653, K655, N652, H567, H500, K871, K857, K870, W842,E873, R877, K846, R874, R762, R791, G566, K590, R638, S757, N756, orK741.44. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: H567, H500, K871, K857, K870, W842,E873, R877, K846, R874, R762, R791, G566, S757, or N756.45. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 1 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 1 ofPbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874,R762, R791, G566, S757, or N756.46. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: H567, H500, R762, R791, G566, S757,or N756.47. 
The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 1 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 1 ofPbCas13b: H567, H500, R762, R791, G566, S757, or N756.48. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: K871, K857, K870, W842, E873, R877,K846, R874, R762, R791, S757, or N756.49. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 1-3 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 1-3 ofPbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, R791,S757, or N756.50. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: R762, R791, S757, or N756.51. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 1-3 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 1-3 ofPbCas13b: R762, R791, S757, or N756.52. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: S658, N653, K655, N652, K590, R638,or K741.53. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in helical domain 2 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 2 ofPbCas13b: S658, N653, K655, N652, K590, R638, or K741.54. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: H407, N486, K484, N480, H452, N455,or K457.55. 
The engineered CRISPR-Cas protein of any one of preceding statementscomprising in the LID domain one or more mutation of an amino acidcorresponding to the following amino acids in the LID domain ofPbCas13b: H407, N486, K484, N480, H452, N455, or K457.56. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: R56, N157, H161, R1068, N1069, orH1073.57. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in a HEPN domain one or more mutation of an amino acidcorresponding to the following amino acids in a HEPN domain of PbCas13b:R56, N157, H161, R1068, N1069, or H1073.58. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: R56, N157, or H161.59. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in HEPN domain 1 one or more mutation of an amino acidcorresponding to the following amino acids in HEPN domain 1 of PbCas13b:R56, N157, or H161.60. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: R1068, N1069, or H1073.61. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in HEPN domain 2 one or more mutation of an amino acidcorresponding to the following amino acids in HEPN domain 2 of PbCas13b:R1068, N1069, or H1073.62. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: K393, R402, N482, T405, H407, N486,K484, N480, H452, N455, or K457.63. 
The engineered CRISPR-Cas protein of any one of preceding statementscomprising in the LID domain one or more mutation of an amino acidcorresponding to the following amino acids in the LID domain ofPbCas13b: K393, R402, N482, T405, H407, N486, K484, N480, H452, N455, orK457.64. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: K393, R402, N482, H407, N486, K484,N480, H452, N455, or K457.65. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in the LID domain one or more mutation of an amino acidcorresponding to the following amino acids in the LID domain ofPbCas13b: K393, R402, N482, H407, N486, K484, N480, H452, N455, or K457.66. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: T405, H407, S658, N653, A656, K655,N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874,R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484,N480, K457, K741, K393, R402, or N482.67. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: H407, S658, N653, K655, N652, H567,N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791,G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393,R402, or N482.68. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: S658, N653, A656, K655, N652, H567,N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795,A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457,or K741.69. 
The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: S658, N653, K655, N652, H567, N455,H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566,K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741.70. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: N486, K484, N480, H452, N455, orK457.71. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in the LID domain one or more mutation of an amino acidcorresponding to the following amino acids in the LID domain ofPbCas13b: N486, K484, N480, H452, N455, or K457.72. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: K393, R402, N482, N486, K484, N480,H452, N455, or K457.73. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in the LID domain one or more mutation of an amino acidcorresponding to the following amino acids in the LID domain ofPbCas13b: K393, R402, N482, N486, K484, N480, H452, N455, or K457.74. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: S658, N653, A656, K655, N652, H567,N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795,A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457,K741, K393, R402, or N482.75. 
The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of PbCas13b: S658, N653, K655, N652, H567, N455,H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566,K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402,or N482.76. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164,K943, or R1041.77. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of Prevotella buccae Cas13b (PbCas13b): R53 orY164.78. The engineered CRISPR-Cas protein of any one of preceding statementscomprising one or more mutation of an amino acid corresponding to thefollowing amino acids of Prevotella buccae Cas13b (PbCas13b): K943 orR1041.79. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in a HEPN domain one or more mutation of an amino acidcorresponding to the following amino acids in a HEPN domain ofPrevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041.80. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in HEPN domain 1 one or more mutation of an amino acidcorresponding to the following amino acids in HEPN domain 1 ofPrevotella buccae Cas13b (PbCas13b): R53 or Y164.81. The engineered CRISPR-Cas protein of any one of preceding statementscomprising in HEPN domain 2 one or more mutation of an amino acidcorresponding to the following amino acids in HEPN domain 2 ofPrevotella buccae Cas13b (PbCas13b): K943 or R1041.82. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, R1041, R56, N157, H161, R1068, N1069, or H1073.
83. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, R56, N157, or H161.
84. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073.
85. The engineered CRISPR-Cas protein of any one of preceding statements comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, R1041, R56, N157, H161, R1068, N1069, or H1073.
86. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53, Y164, R56, N157, or H161.
87. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073.
88. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, or R1041.
89. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, or K193.
90. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943 or R1041.
91. The engineered CRISPR-Cas protein of any one of preceding statements comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, or R1041.
92. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, or K193.
93. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943 or R1041.
94. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, R1041, R56, N157, H161, R1068, N1069, or H1073.
95. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, R56, N157, or H161.
96. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073.
97. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, R1041, R56, N157, H161, R1068, N1069, or H1073.
98. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, R56, N157, or H161.
99. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073.
100. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K183 or K193.
101. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): K183 or K193.
102. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041.
103. The engineered CRISPR-Cas protein of any one of preceding statements comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041.
104. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, K943, or R1041; preferably R53A, R53K, R53D, or R53E; K943A, K943R, K943D, or K943E; or R1041A, R1041K, R1041D, or R1041E.
105. The engineered CRISPR-Cas protein of any one of preceding statements comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, K943, or R1041; preferably R53A, R53K, R53D, or R53E; K943A, K943R, K943D, or K943E; or R1041A, R1041K, R1041D, or R1041E.
106. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid Y164 of Prevotella buccae Cas13b (PbCas13b), preferably Y164A, Y164F, or Y164W.
107. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 1 a mutation of an amino acid corresponding to amino acid Y164 in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b), preferably Y164A, Y164F, or Y164W.
108. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, D434, K431, R402, K393, R482, N480, D396, E397, D398, or E399.
109. The engineered CRISPR-Cas protein of any one of preceding statements comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, D434, K431, R402, K393, R482, N480, D396, E397, D398, or E399.
110. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid H407 of Prevotella buccae Cas13b (PbCas13b), preferably H407Y, H407W, or H407F.
111. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R402, K393, R482, N480, D396, E397, D398, or E399.
112. The engineered CRISPR-Cas protein of any one of preceding statements comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): R402, K393, R482, N480, D396, E397, D398, or E399.
113. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K457, D434, or K431.
114. The engineered CRISPR-Cas protein of any one of preceding statements comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): K457, D434, or K431.
115. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, Q646, N647, N653, or N652.
116. The engineered CRISPR-Cas protein of any one of preceding statements comprising in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of Prevotella buccae Cas13b (PbCas13b): H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, Q646, N647, N653, or N652.
117. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
118. The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
119. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, or R791.
120. The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, or R791.
121. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
122. The engineered CRISPR-Cas protein of any one of preceding statements comprising in the bridge helix domain one or more mutation of an amino acid corresponding to the following amino acids in the bridge helix domain of Prevotella buccae Cas13b (PbCas13b): K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
123. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500 or K570.
124. The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 1-2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-2 of Prevotella buccae Cas13b (PbCas13b): H500 or K570.
125. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
126. The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
127. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, or R791.
128. The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, or R791.
129. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, or R877.
130. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, or R877.
131. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
132. The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
133. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, K744, R600, K607, K612, R614, K617, R618, Q646, N647, N653, or N652.
134. The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, K744, R600, K607, K612, R614, K617, R618, Q646, N647, N653, or N652.
135. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): Q646 or N647.
136. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): Q646 or N647.
137. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N653 or N652.
138. The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): N653 or N652.
139. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, or K744.
140. The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, or K744.
141. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R600, K607, K612, R614, K617, or R618.
142. The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): R600, K607, K612, R614, K617, or R618.
143. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, E296, N297, or K294.
144. The engineered CRISPR-Cas protein of any one of preceding statements comprising in the IDL domain one or more mutation of an amino acid corresponding to the following amino acids in the IDL domain of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, E296, N297, or K294.
145. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, K292, E296, or N297.
146. The engineered CRISPR-Cas protein of any one of preceding statements comprising in the IDL domain one or more mutation of an amino acid corresponding to the following amino acids in the IDL domain of Prevotella buccae Cas13b (PbCas13b): R285, K292, E296, or N297.
147. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R285, R287, K292, E296, N297, Q646, N647, or K294.
148. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R402, K393, N653, N652, R482, N480, D396, E397, D398, or E399.
149. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, K655, R762, or R1041; preferably R53A or R53D; K655A; R762A; or R1041E or R1041D.
150. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N297, E296, K292, or R285; preferably N297A, E296A, K292A, or R285A.
151. The engineered CRISPR-Cas protein of any one of preceding statements comprising in (the central channel of) the IDL domain one or more mutation of an amino acid corresponding to the following amino acids in (the central channel of) the IDL domain of Prevotella buccae Cas13b (PbCas13b): N297, E296, K292, or R285; preferably N297A, E296A, K292A, or R285A.
152. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): Q831, K836, R838, N652, N653, R830, K655 or R762; preferably Q831A, K836A, R838A, N652A, N653A, R830A, K655A, or R762A.
153. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N652, N653, R830, K655 or R762; preferably N652A, N653A, R830A, K655A, or R762A.
154. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K655 or R762; preferably K655A or R762A.
155. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of Prevotella buccae Cas13b (PbCas13b): Q831, K836, R838, N652, N653, R830, K655 or R762; preferably Q831A, K836A, R838A, N652A, N653A, R830A, K655A, or R762A.
156. The engineered CRISPR-Cas protein of any one of preceding statements comprising in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of Prevotella buccae Cas13b (PbCas13b): N652, N653, R830, K655 or R762; preferably N652A, N653A, R830A, K655A, or R762A.
157. The engineered CRISPR-Cas protein of any one of preceding statements comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): K655 or R762; preferably K655A or R762A.
158. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R614, K607, K193, K183 or R600; preferably R614A, K607A, K193A, K183A or R600A.
159. The engineered CRISPR-Cas protein of any one of preceding statements comprising in the trans-subunit loop of helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in the trans-subunit loop of helical domain 2 of Prevotella buccae Cas13b (PbCas13b): Q646 or N647; preferably Q646A or N647A.
160. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53 or R1041; preferably R53A or R53D, or R1041E or R1041D.
161. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53 or R1041; preferably R53A or R53D, or R1041E or R1041D.
162. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K457, D397, E398, D399, E400, T405, H407 or D434; preferably D397A, E398A, D399A, E400A, T405A, H407A, H407W, H407Y, H407F or D434A.
163. The engineered CRISPR-Cas protein of any one of preceding statements comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): K457, D397, E398, D399, E400, T405, H407 or D434; preferably D397A, E398A, D399A, E400A, T405A, H407A, H407W, H407Y, H407F or D434A.
164. The engineered CRISPR-Cas protein of any one of preceding statements, wherein the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): amino acids 46-57, 73-79, 152-164, 1036-1046, and 1064-1074.
165. The engineered CRISPR-Cas protein of any one of preceding statements, comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R156, N157, H161, R1068, N1069, and H1073.
166. The engineered CRISPR-Cas protein of any one of preceding statements, comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, K294, E296, and N297.
167. The engineered CRISPR-Cas protein of any one of preceding statements, comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, and R838.
168. 
The engineered CRISPR-Cas protein of any one of preceding statements, comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, and R877.
169. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid T405 of Prevotella buccae Cas13b (PbCas13b).
170. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid H407 of Prevotella buccae Cas13b (PbCas13b).
171. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K457 of Prevotella buccae Cas13b (PbCas13b).
172. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid H500 of Prevotella buccae Cas13b (PbCas13b).
173. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K570 of Prevotella buccae Cas13b (PbCas13b).
174. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K590 of Prevotella buccae Cas13b (PbCas13b).
175. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N634 of Prevotella buccae Cas13b (PbCas13b).
176. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R638 of Prevotella buccae Cas13b (PbCas13b).
177. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N652 of Prevotella buccae Cas13b (PbCas13b).
178. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N653 of Prevotella buccae Cas13b (PbCas13b).
179. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K655 of Prevotella buccae Cas13b (PbCas13b).
180. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid S658 of Prevotella buccae Cas13b (PbCas13b).
181. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K741 of Prevotella buccae Cas13b (PbCas13b).
182. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K744 of Prevotella buccae Cas13b (PbCas13b).
183. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N756 of Prevotella buccae Cas13b (PbCas13b).
184. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid S757 of Prevotella buccae Cas13b (PbCas13b).
185. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R762 of Prevotella buccae Cas13b (PbCas13b).
186. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R791 of Prevotella buccae Cas13b (PbCas13b).
187. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K846 of Prevotella buccae Cas13b (PbCas13b).
188. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K857 of Prevotella buccae Cas13b (PbCas13b).
189. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K870 of Prevotella buccae Cas13b (PbCas13b).
190. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R877 of Prevotella buccae Cas13b (PbCas13b).
191. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K183 of Prevotella buccae Cas13b (PbCas13b).
192. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K193 of Prevotella buccae Cas13b (PbCas13b).
193. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R600 of Prevotella buccae Cas13b (PbCas13b).
194. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K607 of Prevotella buccae Cas13b (PbCas13b).
195. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K612 of Prevotella buccae Cas13b (PbCas13b).
196. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R614 of Prevotella buccae Cas13b (PbCas13b).
197. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K617 of Prevotella buccae Cas13b (PbCas13b).
198. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K826 of Prevotella buccae Cas13b (PbCas13b).
199. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K828 of Prevotella buccae Cas13b (PbCas13b).
200. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K829 of Prevotella buccae Cas13b (PbCas13b).
201. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R824 of Prevotella buccae Cas13b (PbCas13b).
202. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R830 of Prevotella buccae Cas13b (PbCas13b).
203. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid Q831 of Prevotella buccae Cas13b (PbCas13b).
204. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K835 of Prevotella buccae Cas13b (PbCas13b).
205. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K836 of Prevotella buccae Cas13b (PbCas13b).
206. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R838 of Prevotella buccae Cas13b (PbCas13b).
207. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R618 of Prevotella buccae Cas13b (PbCas13b).
208. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid D434 of Prevotella buccae Cas13b (PbCas13b).
209. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K431 of Prevotella buccae Cas13b (PbCas13b).
210. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R53 of Prevotella buccae Cas13b (PbCas13b).
211. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K943 of Prevotella buccae Cas13b (PbCas13b).
212. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R1041 of Prevotella buccae Cas13b (PbCas13b).
213. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid Y164 of Prevotella buccae Cas13b (PbCas13b).
214. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R285 of Prevotella buccae Cas13b (PbCas13b).
215. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R287 of Prevotella buccae Cas13b (PbCas13b).
216. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K292 of Prevotella buccae Cas13b (PbCas13b).
217. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid E296 of Prevotella buccae Cas13b (PbCas13b).
218. 
The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N297 of Prevotella buccae Cas13b (PbCas13b).
219. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid Q646 of Prevotella buccae Cas13b (PbCas13b).
220. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N647 of Prevotella buccae Cas13b (PbCas13b).
221. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R402 of Prevotella buccae Cas13b (PbCas13b).
222. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K393 of Prevotella buccae Cas13b (PbCas13b).
223. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N653 of Prevotella buccae Cas13b (PbCas13b).
224. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N652 of Prevotella buccae Cas13b (PbCas13b).
225. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R482 of Prevotella buccae Cas13b (PbCas13b).
226. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N480 of Prevotella buccae Cas13b (PbCas13b).
227. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid D396 of Prevotella buccae Cas13b (PbCas13b).
228.
The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid E397 of Prevotella buccae Cas13b (PbCas13b).
229. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid D398 of Prevotella buccae Cas13b (PbCas13b).
230. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid E399 of Prevotella buccae Cas13b (PbCas13b).
231. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K294 of Prevotella buccae Cas13b (PbCas13b).
232. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid E400 of Prevotella buccae Cas13b (PbCas13b).
233. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R56 of Prevotella buccae Cas13b (PbCas13b).
234. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N157 of Prevotella buccae Cas13b (PbCas13b).
235. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid H161 of Prevotella buccae Cas13b (PbCas13b).
236. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid H452 of Prevotella buccae Cas13b (PbCas13b).
237. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N455 of Prevotella buccae Cas13b (PbCas13b).
238.
The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K484 of Prevotella buccae Cas13b (PbCas13b).
239. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N486 of Prevotella buccae Cas13b (PbCas13b).
240. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid G566 of Prevotella buccae Cas13b (PbCas13b).
241. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid H567 of Prevotella buccae Cas13b (PbCas13b).
242. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid A656 of Prevotella buccae Cas13b (PbCas13b).
243. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid V795 of Prevotella buccae Cas13b (PbCas13b).
244. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid A796 of Prevotella buccae Cas13b (PbCas13b).
245. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid W842 of Prevotella buccae Cas13b (PbCas13b).
246. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid K871 of Prevotella buccae Cas13b (PbCas13b).
247. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid E873 of Prevotella buccae Cas13b (PbCas13b).
248.
The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R874 of Prevotella buccae Cas13b (PbCas13b).
249. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid R1068 of Prevotella buccae Cas13b (PbCas13b).
250. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid N1069 of Prevotella buccae Cas13b (PbCas13b).
251. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid H1073 of Prevotella buccae Cas13b (PbCas13b).
252. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutations of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, H602, R1278, N1279, or H1283.
253. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutations of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, H602, R1278, N1279, or H1283.
254. The engineered CRISPR-Cas protein of any one of preceding statements comprising in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, H602, R1278, N1279, or H1283.
255. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutations of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, or H602.
256. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 1 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 1 of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, or H602.
257.
The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutations of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R1278, N1279, or H1283.
258. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 2 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 2 of Leptotrichia shahii Cas13a (LshCas13a): R1278, N1279, or H1283.
259. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutations of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R146, H151, R1116, or H1121.
260. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutations of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R146, H151, R1116, or H1121.
261. The engineered CRISPR-Cas protein of any one of preceding statements comprising in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Porphyromonas gulae Cas13b (PguCas13b): R146, H151, R1116, or H1121.
262. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutations of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R146 or H151.
263. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 1 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 1 of Porphyromonas gulae Cas13b (PguCas13b): R146 or H151.
264. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutations of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R1116 or H1121.
265.
The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 2 one or more mutations of an amino acid corresponding to the following amino acids in HEPN domain 2 of Porphyromonas gulae Cas13b (PguCas13b): R1116 or H1121.
266. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutations of an amino acid corresponding to the following amino acids of Prevotella sp. P5-125 Cas13b (PspCas13b): H133 or H1058.
267. The engineered CRISPR-Cas protein of any one of preceding statements comprising one or more mutations of an amino acid corresponding to the following amino acids of Prevotella sp. P5-125 Cas13b (PspCas13b): H133 or H1058.
268. The engineered CRISPR-Cas protein of any one of preceding statements comprising in a HEPN domain one or more mutations of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella sp. P5-125 Cas13b (PspCas13b): H133 or H1058.
269. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid H133 of Prevotella sp. P5-125 Cas13b (PspCas13b).
270. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 1 a mutation of an amino acid corresponding to amino acid H133 in HEPN domain 1 of Prevotella sp. P5-125 Cas13b (PspCas13b).
271. The engineered CRISPR-Cas protein of any one of preceding statements comprising a mutation of an amino acid corresponding to amino acid H1058 of Prevotella sp. P5-125 Cas13b (PspCas13b).
272. The engineered CRISPR-Cas protein of any one of preceding statements comprising in HEPN domain 2 a mutation of an amino acid corresponding to the amino acid H1058 in HEPN domain 2 of Prevotella sp. P5-125 Cas13b (PspCas13b).
273. The engineered CRISPR-Cas protein of any of statements 8 to 272, wherein said amino acid is mutated to A, P, or V, preferably A.
274.
The engineered CRISPR-Cas protein of any of statements 8 to 272, wherein said amino acid is mutated to a hydrophobic amino acid.
275. The engineered CRISPR-Cas protein of any of statements 8 to 272, wherein said amino acid is mutated to an aromatic amino acid.
276. The engineered CRISPR-Cas protein of any of statements 8 to 272, wherein said amino acid is mutated to a charged amino acid.
277. The engineered CRISPR-Cas protein of any of statements 8 to 272, wherein said amino acid is mutated to a positively charged amino acid.
278. The engineered CRISPR-Cas protein of any of statements 8 to 272, wherein said amino acid is mutated to a negatively charged amino acid.
279. The engineered CRISPR-Cas protein of any of statements 8 to 272, wherein said amino acid is mutated to a polar amino acid.
280. The engineered CRISPR-Cas protein of any of statements 8 to 272, wherein said amino acid is mutated to an aliphatic amino acid.
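
As an illustration of the substitution classes recited in statements 273 to 280, the following Python sketch groups the twenty standard amino acids under one commonly used classification and reports which classes the substituted residue of a mutation such as R53A falls into. The groupings, the function name `substitution_classes`, and the mutation-string format are assumptions made for illustration only; they are not definitions taken from this disclosure, and other classification schemes assign some residues (for example H, G, C, or Y) differently.

```python
import re

# One common textbook grouping of the 20 standard amino acids.
# ASSUMPTION: the disclosure does not define these classes residue by residue.
AMINO_ACID_CLASSES = {
    "hydrophobic": set("AVLIMFWP"),
    "aromatic": set("FWY"),
    "aliphatic": set("AVLI"),
    "positively charged": set("KRH"),
    "negatively charged": set("DE"),
    "polar": set("STNQCY"),
}
# "charged" is taken here as the union of the two charged groups.
AMINO_ACID_CLASSES["charged"] = (
    AMINO_ACID_CLASSES["positively charged"]
    | AMINO_ACID_CLASSES["negatively charged"]
)

# Hypothetical notation: wild-type residue, position, substituted residue,
# for example "R53A" (one-letter codes).
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def substitution_classes(mutation: str) -> list:
    """Return the sorted names of the classes containing the substituted residue."""
    match = MUTATION_PATTERN.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation string: {mutation!r}")
    new_residue = match.group(3)
    return sorted(
        name
        for name, members in AMINO_ACID_CLASSES.items()
        if new_residue in members
    )


if __name__ == "__main__":
    for example in ("R53A", "K943E", "H133K"):
        print(example, substitution_classes(example))
```

Under these assumed groupings, a substitution to alanine (the preferred residue in statement 273) is reported as both aliphatic and hydrophobic, which shows that the classes in statements 274 to 280 can overlap.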

281. The engineered CRISPR-Cas protein of any one of preceding statements, wherein said Cas13 protein is or originates from a species of the genus Alistipes, Anaerosalibacter, Bacteroides, Bacteroidetes, Bergeyella, Blautia, Butyrivibrio, Capnocytophaga, Carnobacterium, Chloroflexus, Chryseobacterium, Clostridium, Demequina, Eubacteriaceae, Eubacterium, Flavobacterium, Fusobacterium, Herbinix, Insolitispirillum, Lachnospiraceae, Leptotrichia, Listeria, Myroides, Paludibacter, Phaeodactylibacter, Porphyromonadaceae, Porphyromonas, Prevotella, Pseudobutyrivibrio, Psychroflexus, Reichenbachiella, Rhodobacter, Riemerella, Sinomicrobium, Thalassospira, or Ruminococcus; preferably Leptotrichia shahii, Listeria seeligeri, Lachnospiraceae bacterium (such as Lb MA2020, Lb NK4A179, Lb NK4A144), Clostridium aminophilum (such as Ca DSM 10710), Carnobacterium gallinarum (such as Cg DSM 4847), Paludibacter propionicigenes (such as Pp WB4), Listeria weihenstephanensis (such as Lw FSL R9-0317), Listeriaceae bacterium (such as Lb FSL M6-0635), Leptotrichia wadei (such as Lw F0279), Rhodobacter capsulatus (such as Rc SB 1003, Rc R121, Rc DE442), Leptotrichia buccalis (such as Lb C-1013-b), Herbinix hemicellulosilytica, Eubacteriaceae bacterium (such as Eb CHKCI004), Blautia sp. Marseille-P2398, Leptotrichia sp. oral taxon 879 str. F0557, Chloroflexus aggregans, Demequina aurantiaca, Thalassospira sp. TSL5-1, Pseudobutyrivibrio sp. OR37, Butyrivibrio sp. YAB3001, Leptotrichia sp. Marseille-P3007, Bacteroides ihuae, Porphyromonadaceae bacterium (such as Pb KH3CP3RA), Listeria riparia, Insolitispirillum peregrinum, Alistipes sp. ZOR0009, Bacteroides pyogenes (such as Bp F0041), Bacteroidetes bacterium (such as Bb GWA2_31_9), Bergeyella zoohelcum (such as Bz ATCC 43767), Capnocytophaga canimorsus, Capnocytophaga cynodegmi, Chryseobacterium carnipullorum, Chryseobacterium jejuense, Chryseobacterium ureilyticum, Flavobacterium branchiophilum, Flavobacterium columnare, Flavobacterium sp. 316, Myroides odoratimimus (such as Mo CCUG 10230, Mo CCUG 12901, Mo CCUG 3837), Paludibacter propionicigenes, Phaeodactylibacter xiamenensis, Porphyromonas gingivalis (such as Pg F0185, Pg F0568, Pg JCVI SC001, Pg W4087), Porphyromonas gulae, Porphyromonas sp. COT-052 OH4946, Prevotella aurantiaca, Prevotella buccae (such as Pb ATCC 33574), Prevotella falsenii, Prevotella intermedia (such as Pi 17, Pi ZT), Prevotella pallens (such as Pp ATCC 700821), Prevotella pleuritidis, Prevotella saccharolytica (such as Ps F0055), Prevotella sp. MA2016, Prevotella sp. MSX73, Prevotella sp. P4-76, Prevotella sp. P5-119, Prevotella sp. P5-125, Prevotella sp. P5-60, Psychroflexus torquis, Reichenbachiella agariperforans, Riemerella anatipestifer, Sinomicrobium oceani, Fusobacterium necrophorum (such as Fn subsp. funduliforme ATCC 51357, Fn DJ-2, Fn BFTR-1, Fn subsp. funduliforme), Fusobacterium perfoetens (such as Fp ATCC 29250), Fusobacterium ulcerans (such as Fu ATCC 49185), Anaerosalibacter sp. ND1, Eubacterium siraeum, Ruminococcus flavefaciens (such as Rfx XPD3002), or Ruminococcus albus.

282. The engineered CRISPR-Cas protein of any one of preceding statements, wherein said Cas13 protein is a Cas13a protein.
283. The engineered CRISPR-Cas protein of statement 282, wherein said Cas13a protein is or originates from a species of the genus Bacteroides, Blautia, Butyrivibrio, Carnobacterium, Chloroflexus, Clostridium, Demequina, Eubacterium, Herbinix, Insolitispirillum, Lachnospiraceae, Leptotrichia, Listeria, Paludibacter, Porphyromonadaceae, Pseudobutyrivibrio, Rhodobacter, or Thalassospira; preferably Leptotrichia shahii, Listeria seeligeri, Lachnospiraceae bacterium (such as Lb MA2020, Lb NK4A179, Lb NK4A144), Clostridium aminophilum (such as Ca DSM 10710), Carnobacterium gallinarum (such as Cg DSM 4847), Paludibacter propionicigenes (such as Pp WB4), Listeria weihenstephanensis (such as Lw FSL R9-0317), Listeriaceae bacterium (such as Lb FSL M6-0635), Leptotrichia wadei (such as Lw F0279), Rhodobacter capsulatus (such as Rc SB 1003, Rc R121, Rc DE442), Leptotrichia buccalis (such as Lb C-1013-b), Herbinix hemicellulosilytica, Eubacteriaceae bacterium (such as Eb CHKCI004), Blautia sp. Marseille-P2398, Leptotrichia sp. oral taxon 879 str. F0557, Chloroflexus aggregans, Demequina aurantiaca, Thalassospira sp. TSL5-1, Pseudobutyrivibrio sp. OR37, Butyrivibrio sp. YAB3001, Leptotrichia sp. Marseille-P3007, Bacteroides ihuae, Porphyromonadaceae bacterium (such as Pb KH3CP3RA), Listeria riparia, or Insolitispirillum peregrinum.
284. The engineered CRISPR-Cas protein of any one of preceding statements, wherein said Cas13 protein is a Cas13b protein.
285.
The engineered CRISPR-Cas protein of statement 284, wherein said Cas13b protein is or originates from a species of the genus Alistipes, Bacteroides, Bacteroidetes, Bergeyella, Capnocytophaga, Chryseobacterium, Flavobacterium, Myroides, Paludibacter, Phaeodactylibacter, Porphyromonas, Prevotella, Psychroflexus, Reichenbachiella, Riemerella, or Sinomicrobium; preferably Alistipes sp. ZOR0009, Bacteroides pyogenes (such as Bp F0041), Bacteroidetes bacterium (such as Bb GWA2_31_9), Bergeyella zoohelcum (such as Bz ATCC 43767), Capnocytophaga canimorsus, Capnocytophaga cynodegmi, Chryseobacterium carnipullorum, Chryseobacterium jejuense, Chryseobacterium ureilyticum, Flavobacterium branchiophilum, Flavobacterium columnare, Flavobacterium sp. 316, Myroides odoratimimus (such as Mo CCUG 10230, Mo CCUG 12901, Mo CCUG 3837), Paludibacter propionicigenes, Phaeodactylibacter xiamenensis, Porphyromonas gingivalis (such as Pg F0185, Pg F0568, Pg JCVI SC001, Pg W4087), Porphyromonas gulae, Porphyromonas sp. COT-052 OH4946, Prevotella aurantiaca, Prevotella buccae (such as Pb ATCC 33574), Prevotella falsenii, Prevotella intermedia (such as Pi 17, Pi ZT), Prevotella pallens (such as Pp ATCC 700821), Prevotella pleuritidis, Prevotella saccharolytica (such as Ps F0055), Prevotella sp. MA2016, Prevotella sp. MSX73, Prevotella sp. P4-76, Prevotella sp. P5-119, Prevotella sp. P5-125, Prevotella sp. P5-60, Psychroflexus torquis, Reichenbachiella agariperforans, Riemerella anatipestifer, or Sinomicrobium oceani.
286. The engineered CRISPR-Cas protein of any one of preceding statements, wherein said Cas13 protein is a Cas13c protein.
287. The engineered CRISPR-Cas protein of statement 286, wherein said Cas13c protein is or originates from a species of the genus Fusobacterium or Anaerosalibacter; preferably Fusobacterium necrophorum (such as Fn subsp. funduliforme ATCC 51357, Fn DJ-2, Fn BFTR-1, Fn subsp. funduliforme), Fusobacterium perfoetens (such as Fp ATCC 29250), Fusobacterium ulcerans (such as Fu ATCC 49185), or Anaerosalibacter sp. ND1.
288. The engineered CRISPR-Cas protein of any one of preceding statements, wherein said Cas13 protein is a Cas13d protein.
289. The engineered CRISPR-Cas protein of statement 288, wherein said Cas13d protein is or originates from a species of the genus Eubacterium or Ruminococcus, preferably Eubacterium siraeum, Ruminococcus flavefaciens (such as Rfx XPD3002), or Ruminococcus albus.
290. The engineered CRISPR-Cas protein of any one of preceding statements, wherein catalytic activity of the engineered CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein.
291. The engineered CRISPR-Cas protein of any one of preceding statements, wherein catalytic activity of the engineered CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein.
292. The engineered CRISPR-Cas protein of any one of preceding statements, wherein gRNA binding of the engineered CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein.
293. The engineered CRISPR-Cas protein of any one of preceding statements, wherein gRNA binding of the engineered CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein.
294. The engineered CRISPR-Cas protein of any one of preceding statements, wherein specificity of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein.
295. The engineered CRISPR-Cas protein of any one of preceding statements, wherein specificity of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein.
296. The engineered CRISPR-Cas protein of any one of preceding statements, wherein stability of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein.
297.
The engineered CRISPR-Cas protein of any one of preceding statements, wherein stability of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein.
298. The engineered CRISPR-Cas protein of any one of preceding statements, further comprising one or more mutations which inactivate catalytic activity.
299. The engineered CRISPR-Cas protein of any one of preceding statements, wherein off-target binding of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein.
300. The engineered CRISPR-Cas protein of any one of preceding statements, wherein off-target binding of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein.
301. The engineered CRISPR-Cas protein of any one of preceding statements, wherein target binding of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein.
302. The engineered CRISPR-Cas protein of any one of preceding statements, wherein target binding of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein.
303. The engineered CRISPR-Cas protein of any one of preceding statements, wherein the engineered CRISPR-Cas protein has a higher protease activity or polynucleotide-binding capability compared to a corresponding wildtype CRISPR-Cas protein.
304. The engineered CRISPR-Cas protein of any one of preceding statements, wherein PFS recognition is altered as compared to a corresponding wildtype CRISPR-Cas protein.
305. The engineered CRISPR-Cas protein of any one of preceding statements, further comprising a functional heterologous domain.
306. The engineered CRISPR-Cas protein of any one of preceding statements, further comprising an NLS.
307. The engineered CRISPR-Cas protein of any one of preceding statements, further comprising an NES.
308. An engineered CRISPR-Cas protein that comprises one or more HEPN domains and is less than 1000 amino acids in length.
309.
The engineered CRISPR-Cas protein of statement 308, wherein the protein is less than 950, less than 900, less than 850, less than 800, or less than 750 amino acids in size.
310. The engineered CRISPR-Cas protein of statement 308 or 309, wherein the HEPN domain comprises a RxxxxH motif.
311. The engineered CRISPR-Cas protein of statement 310, wherein the RxxxxH motif comprises an R[N/H/K]X₁X₂X₃H sequence.
312. The engineered CRISPR-Cas protein of statement 311, wherein: X₁ is R, S, D, E, Q, N, G, or Y; X₂ is independently I, S, T, V, or L; and X₃ is independently L, F, N, Y, V, I, S, D, E, or A.
313. The engineered CRISPR-Cas protein of any one of statements 308-313, wherein the CRISPR-Cas protein is a Type VI CRISPR Cas protein.
314. The engineered CRISPR Cas protein of statement 313, wherein the Type VI CRISPR Cas protein is a Cas13a, a Cas13b, a Cas13c, or a Cas13d.
315. The engineered CRISPR-Cas protein of any one of statements 308 to 315, wherein the CRISPR-Cas protein is associated with a functional domain.
316. The engineered CRISPR-Cas protein of any one of statements 308 to 316, wherein the CRISPR-Cas protein comprises one or more mutations equivalent to mutations in any one of statements [1386]57-[1386]329.
317. The engineered CRISPR-Cas protein of statement 316, wherein the CRISPR-Cas protein comprises one or more mutations in the helical domain.
318. The engineered CRISPR-Cas protein of any one of statements 308 to 318, wherein the CRISPR-Cas protein is in a dead form or has nickase activity.
319. A polynucleotide encoding the engineered CRISPR-Cas protein of any of statements 1 to 318.
320. The polynucleotide according to statement 319, which is codon optimized.
321.
A CRISPR-Cas system comprising the engineered CRISPR-Cas protein of any of statements 1 to [1386]367 or the polynucleotide of statement 318 or 319, and a nucleotide component capable of forming a complex with the engineered CRISPR-Cas protein and able to hybridize with a target nucleic acid sequence and direct sequence-specific binding of said complex to the target nucleic acid sequence.
322. A vector system comprising one or more vectors, the one or more vectors comprising one or more polynucleotide molecules encoding components of the engineered CRISPR-Cas protein of statement 321.
323. A method of modifying a target nucleic acid comprising: introducing in a cell or organism that comprises the target nucleic acid, the engineered CRISPR-Cas protein according to any of statements 1 to 318, the polynucleic acid according to statement 319 or 320, the CRISPR-Cas system according to statement 321, or the vector or vector system according to statement 322, such that the engineered CRISPR-Cas protein modifies the target nucleic acid in the cell or organism.
324. The method of statement [1386]372, wherein the engineered CRISPR-Cas system is introduced via delivery by liposomes, nanoparticles, exosomes, microvesicles, nucleic acid nanoassemblies, a gene gun, an implantable device, or the vector system of statement 322.
325. The method of statement 323 or 324, wherein the engineered CRISPR-Cas protein is associated with one or more functional domains.
326. The method of any one of statements 323 to 325, wherein the target nucleic acid comprises a genomic locus, and the engineered CRISPR-Cas protein modifies a gene product encoded at the genomic locus or expression of the gene product.
327. The method of any one of statements 323 to 326, wherein the target nucleic acid is DNA or RNA and wherein one or more nucleotides in the target nucleic acid are base edited.
328.
The method of any one of statements 323 to 327, wherein the target nucleic acid is DNA or RNA and wherein the target nucleic acid is cleaved.
329. The method of statement 328, wherein the engineered CRISPR-Cas protein further cleaves non-target nucleic acid.
330. The method of statement 328 or 329, further comprising visualizing activity and, optionally, using a detectable label.
331. The method of any one of statements 328 to 330, further comprising detecting binding of one or more components of the CRISPR-Cas system to the target nucleic acid.
332. The method of any one of statements 328 to 331, wherein said cell or organism is a eukaryotic cell or organism.
333. The method of any one of statements 328 to 332, wherein said cell or organism is an animal cell or organism.
334. The method of any one of statements 328 to 333, wherein said cell or organism is a plant cell or organism.
335. A method for detecting a target nucleic acid in a sample comprising: contacting a sample with: an engineered CRISPR-Cas protein of any one of statements 1 to 318; at least one guide polynucleotide comprising a guide sequence capable of binding to the target nucleic acid and designed to form a complex with the engineered CRISPR-Cas protein; and an RNA-based masking construct comprising a non-target sequence; wherein the engineered CRISPR-Cas protein exhibits collateral RNase activity and cleaves the non-target sequence of the detection construct; and detecting a signal from cleavage of the non-target sequence, thereby detecting the target nucleic acid in the sample.
336. The method of statement 335, further comprising contacting the sample with reagents for amplifying the target nucleic acid.
337. The method of statement 336, wherein the reagents for amplifying comprise isothermal amplification reaction reagents.
338.
The method of statement 337, wherein the isothermal amplification reagents comprise nucleic-acid sequence-based amplification, recombinase polymerase amplification, loop-mediated isothermal amplification, strand displacement amplification, helicase-dependent amplification, or nicking enzyme amplification reagents.
339. The method of any one of statements 335 to 338, wherein the target nucleic acid is a DNA molecule and the method further comprises contacting the target DNA molecule with a primer comprising an RNA polymerase site and RNA polymerase.
340. The method of any one of statements 335 to 339, wherein the masking construct: suppresses generation of a detectable positive signal until the masking construct is cleaved or deactivated, or masks a detectable positive signal or generates a detectable negative signal until the masking construct is cleaved or deactivated.
341. The method of any one of statements 335 to 340, wherein the masking construct comprises: a. a silencing RNA that suppresses generation of a gene product encoded by a reporting construct, wherein the gene product generates the detectable positive signal when expressed; b. a ribozyme that generates the negative detectable signal, and wherein the positive detectable signal is generated when the ribozyme is deactivated; c. a ribozyme that converts a substrate to a first color and wherein the substrate converts to a second color when the ribozyme is deactivated; d. an aptamer and/or comprises a polynucleotide-tethered inhibitor; e. a polynucleotide to which a detectable ligand and a masking component are attached; f. a nanoparticle held in aggregate by bridge molecules, wherein at least a portion of the bridge molecules comprises a polynucleotide, and wherein the solution undergoes a color shift when the nanoparticle is dispersed in solution; g. a quantum dot or fluorophore linked to one or more quencher molecules by a linking molecule, wherein at least a portion of the linking molecule comprises a polynucleotide; h. a polynucleotide in complex with an intercalating agent, wherein the intercalating agent changes absorbance upon cleavage of the polynucleotide; or i. two fluorophores tethered by a polynucleotide that undergo a shift in fluorescence when released from the polynucleotide.
342. The method of statement 341, wherein the aptamer: a. comprises a polynucleotide-tethered inhibitor that sequesters an enzyme, wherein the enzyme generates a detectable signal upon release from the aptamer or polynucleotide-tethered inhibitor by acting upon a substrate; or b. is an inhibitory aptamer that inhibits an enzyme and prevents the enzyme from catalyzing generation of a detectable signal from a substrate or wherein the polynucleotide-tethered inhibitor inhibits an enzyme and prevents the enzyme from catalyzing generation of a detectable signal from a substrate; or c. sequesters a pair of agents that when released from the aptamers combine to generate a detectable signal.
343. The method of statement 341 or 342, wherein the nanoparticle is a colloidal metal.
344. The method of any one of statements 335 to 343, wherein the at least one guide polynucleotide comprises a mismatch.
345. The method of statement 344, wherein the mismatch is up- or downstream of a single nucleotide variation on the one or more guide sequences.
346. A cell or organism comprising the engineered CRISPR-Cas protein according to any of statements 1 to 318, the polynucleic acid according to statement 319 or 320, the CRISPR-Cas system according to statement 321, or the vector or vector system according to statement 322.
347. An engineered adenosine deaminase comprising one or more mutations, wherein the engineered adenosine deaminase has cytidine deaminase activity.
348. The engineered adenosine deaminase of statement 347, wherein the engineered adenosine deaminase has adenosine deaminase activity.
349. The engineered adenosine deaminase of statement 347 or 348, wherein the engineered adenosine deaminase is a portion of a fusion protein.
350.
The engineered adenosine deaminase of statement 349, wherein the fusion protein comprises a functional domain.
351. The engineered adenosine deaminase of statement 350, wherein the functional domain is capable of directing the engineered adenosine deaminase to bind to a target nucleic acid.
352. The engineered adenosine deaminase of statement 350 or 351, wherein the functional domain is a CRISPR-Cas protein of any one of statements 1 to 318.
353. The engineered adenosine deaminase of statement 352, wherein the CRISPR-Cas protein is a dead form CRISPR-Cas protein or CRISPR-Cas nickase protein.
354. The engineered adenosine deaminase of any one of statements 347 to 353, wherein the one or more mutations comprise: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, S661T based on amino acid sequence positions of hADAR2-D, and corresponding mutations in a homologous ADAR protein.
355. The engineered adenosine deaminase of any one of statements 347 to 354, wherein the one or more mutations comprise: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, and S661T based on amino acid sequence positions of hADAR2-D, and corresponding mutations in a homologous ADAR protein.
356. A polynucleotide encoding the engineered adenosine deaminase of any one of statements 347-355, or a catalytic domain thereof.
357. A vector comprising the polynucleotide of statement 356.
358. A pharmaceutical composition comprising the engineered adenosine deaminase of any one of statements 347-355 or a catalytic domain thereof formulated for delivery by liposomes, nanoparticles, exosomes, microvesicles, nucleic acid nanoassemblies, a gene gun, or an implantable device.
359. An engineered cell expressing the engineered adenosine deaminase of any one of statements 347-355 or a catalytic domain thereof.
360.
The engineered cell of statement 359, wherein the cell transientlyexpresses the engineered adenosine deaminase or the catalytic domainthereof.361. The engineered cell of statement 359 or 360, wherein the cellnon-transiently expresses the engineered adenosine deaminase or thecatalytic domain thereof.362. An engineered, non-naturally occurring system for modifyingnucleotides in a target nucleic acid, comprising: a) a dead CRISPR-Casor CRISPR-Cas nickase protein, or a nucleotide sequence encoding saiddead Cas or Cas nickase protein; b) a guide molecule comprising a guidesequence that hybridizes to a target sequence and designed to form acomplex with the dead CRISPR-Cas or CRISPR-Cas nickase protein; and c) anucleotide deaminase protein or catalytic domain thereof, or anucleotide sequence encoding said nucleotide deaminase protein orcatalytic domain thereof, wherein said nucleotide deaminase protein orcatalytic domain thereof is covalently or non-covalently linked to saiddead CRISPR-Cas or CRISPR-Cas nickase protein or said guide molecule isadapted to link thereof after delivery.363. The system of statement 362, wherein said adenosine deaminaseprotein or catalytic domain thereof comprises one or more of themutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I,I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, S661T based onamino acid sequence positions of hADAR2-D, and corresponding mutationsin a homologous ADAR protein.364. The system of statement 362 or 363, wherein said adenosinedeaminase protein or catalytic domain thereof comprises mutations:E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I,M383L, D619G, S582T, V440I, S495N, K418E, and S661T based on amino acidsequence positions of hADAR2-D, and corresponding mutations in ahomologous ADAR protein.365. The system of any one of statements 362 to 364, wherein theCRISPR-Cas protein is Cas9, Cas12, Cas13, Cas 14, CasX, CasY.366. 
The system of any one of statements 362 to 365, wherein theCRISPR-Cas protein is Cas13b.367. The system of any one of statements 362 to 366, wherein theCRISPR-Cas protein is Cas13b-t1, Cas13b-t2, or Cas13b-t3.368. The system of any one of statements 362 to 367, wherein theCRISPR-Cas is an engineered CRISPR-Cas protein of any one of statements1 to 318.369. A method for modifying nucleotide in a target nucleic acid,comprising: delivering to said target nucleic acid the engineeredadenosine deaminase of any one of statements 347-355, or the system ofany one of statements 362-368, wherein the deaminase deaminates anucleotide at one or more target loci on the target nucleic acid.370. The method of statement 369, wherein said nucleotide deaminaseprotein or catalytic domain thereof has been modified to increaseactivity against a DNA-RNA heteroduplex.371. The method of statement 369 or 370, wherein said nucleotidedeaminase protein or catalytic domain thereof has been modified toreduce off-target effects.372. The method of any one of statements 369 to 371, wherein the targetnucleic acid is within a cell.373. The method of statement 372, wherein said cell is a eukaryoticcell.374. The method of statement 372 or 373, wherein said cell is anon-human animal cell.375. The method of any one of statements 372 to 374, wherein said cellis a human cell.376. The method of any one of statements 372 to 375, wherein said cellis a plant cell.377. The method of any one of statements 369 to 376, wherein said targetnucleic acid is within an animal.378. The method of any one of statements 369 to 377, wherein said targetnucleic acid is within a plant.379. The method of any one of statements 369 to 378, wherein said targetnucleic acid is comprised in a DNA molecule in vitro.380. The method of any one of statements 369 to 379, wherein theengineered adenosine deaminase, or one or more components of the systemare delivered to the cell as a ribonucleoprotein complex.381. 
The method of statement 380, wherein the engineered adenosinedeaminase, or one or more components of the system are delivered via oneor more particles, one or more vesicles, or one or more viral vectors.382. The method of statement 381, wherein said one or more particlescomprise a lipid, a sugar, a metal or a protein.383. The method of statement 381 or 382, wherein said one or moreparticles comprise lipid nanoparticles.384. The method of any one of statements 381 to 383, wherein said one ormore vesicles comprise exosomes or liposomes.385. The method of any one of statements 381 to 384, wherein said one ormore viral vectors comprise one or more adenoviral vectors, one or morelentiviral vectors, or one or more adeno-associated viral vectors.386. The method of any one of statements 369 to 385, where said methodmodifies a cell, a cell line or an organism by manipulation of one ormore target sequences at genomic loci of interest.387. The method of statement 386, wherein said deamination of saidnucleotide at said target locus of interest remedies a disease caused bya G→A or C→T point mutation or a pathogenic SNP.388. The method of statement 387, wherein said disease is selected fromcancer, haemophilia, beta-thalassemia, Marfan syndrome andWiskott-Aldrich syndrome.389. The method of statement 386, 387, or 388, wherein said deaminationof said nucleotide at said target locus of interest remedies a diseasecaused by a T→C or A→G point mutation or a pathogenic SNP.390. The method of statement 389, wherein said deamination of saidnucleotide at said target locus of interest inactivates a target gene atsaid target locus.391. The method of any one of statements 380 to 390, wherein theengineered adenosine deaminase, or one or more components of the systemare delivered by liposomes, nanoparticles, exosomes, microvesicles,nucleic acid nanoassemblies, a gene gun, an implantable device, or thevector system of statement 302.392. 
The method of any one of statements 369 to 392, whereinmodification of the nucleotide modifies gene product encoded at thetarget locus or expression of the gene product.393. The engineered adenosine deaminase of any one of statements 347-355or the system of any one of statements 362-368, wherein the adenosineprotein or catalytic domain thereof comprises a mutation on S375 basedon amino acid sequence positions of hADAR2-D, and a correspondingmutation in a homologous ADAR protein.394. The engineered adenosine deaminase or the system of statement 393,wherein the mutation on S375 is S375N.395. The use of the engineered CRISPR-Cas protein or engineeredadenosine deaminase of any one of the preceding statements for thepreparation of a medicament for the treatment of a disease.396. A pharmaceutical formulation comprising the engineered CRISPR-Casprotein or engineered adenosine deaminase of any one of the precedingstatements for use as a medicament.

The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.

EXAMPLES

Example 1: Crystal Structure of Cas13b in Complex with crRNA

Methods

Protein Purification for Crystallization

PbuCas13b was expressed from a pET28-based vector with a twin-strep-SUMO tag fused at the N-terminus in chemically competent BL21(DE3) cells purchased from New England Biolabs. Cells carrying the expression plasmid were grown at 37° C. to OD 0.2, and then the temperature was switched to 21° C. Growth was continued until OD 0.6, at which point expression was induced with 5 μM IPTG. Cultures were grown for 18-20 hours, and cells were then harvested by centrifugation at 5,000 rpm and frozen at −80° C. Frozen cell paste was homogenized in Buffer A (500 mM sodium chloride, 50 mM HEPES pH 7.5, 2 mM DTT) supplemented with benzonase and lysozyme. The cells were broken by two passes through a microfluidizer at 20,000 psi, and cell debris was separated from the soluble fraction by centrifugation at 10,000 rpm. The soluble fraction was passed through Streptactin resin (GE Life Sciences) and washed with 10 column volumes of Buffer A, followed by 10 column volumes of wash buffer (1 M sodium chloride, 50 mM HEPES pH 7.5, 2 mM DTT), and finally by 10 column volumes of Cleavage Buffer (400 mM sodium chloride, 20 mM HEPES pH 7.5, 2 mM DTT). PbuCas13b was eluted from the resin by addition of 5 mM desthiobiotin (Sigma), then cleaved overnight by SUMO protease after being supplemented with 20 mM DTT. After cleavage, the protein was passed through a heparin column, concentrated to 500 μL, and passed over a Superdex 200 column (GE Life Sciences) equilibrated in storage buffer (500 mM sodium chloride, 10 mM HEPES pH 7.0, 2 mM DTT). Peak fractions were pooled and concentrated to at least 20 mg/ml. Selenomethionine protein was purified similarly, except that each buffer was supplemented with 5 mM DTT. Protein was quantified using Pierce reagent (Thermo).

Crystallization and Data Collection

RNA substrate was added to PbuCas13b protein at a 2:1 molar ratio and dialyzed for 7 hours against dialysis buffer (50 mM sodium chloride, 10 mM HEPES pH 7.0, 2 mM TCEP). The PbuCas13b+RNA complex was diluted to 10 mg/ml with dialysis buffer and set up at 20° C. by hanging drop vapor diffusion against 165 mM sodium citrate pH 4.6, 5.5% PEG6000, and 2 mM TCEP at varying drop ratios. Rod-shaped crystals appeared overnight and reached full size in 1-2 months. Crystals were transferred from the drop to cryo-stabilization buffer (140 mM sodium citrate pH 4.6, 5% PEG6000, 35% PEG400), soaked for up to 24 hours, then flash frozen in liquid nitrogen. Selenium-containing crystals for phasing were grown in similar conditions supplemented with 5 mM TCEP.

Native diffraction data from crystals of PbuCas13b and guide RNA were collected at the Advanced Photon Source, Argonne National Laboratory, on beamlines 23-ID-B/D, and anomalous data were collected at the Diamond Light Source on beamline I04. A small beam was used, either collimated (23-ID) or focused (Diamond) to 20 microns, and multiple datasets were collected along the length of the crystal. Anomalous datasets were collected at wavelengths of 0.97934 Å (peak), 0.97958 Å (inflection), and 0.97204 Å (remote). Diffraction data were processed using XDS (1, 2) and scaled in AIMLESS (3) as implemented in the autoPROC toolbox (4). The statistics are summarized in Table 10 below.
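
As a quick plausibility check on the three anomalous wavelengths listed above, the following minimal sketch converts each wavelength to a photon energy using E = hc/λ. The constant (a rounded value of hc in keV·Å) and all names are illustrative assumptions and are not part of the original methods. The resulting energies fall close to the selenium K absorption edge (about 12.66 keV), as expected for a selenium MAD experiment.

```python
# Convert X-ray wavelengths (angstrom) to photon energies (keV) via E = hc / lambda.
# HC_KEV_ANGSTROM is a rounded physical constant; names here are illustrative only.
HC_KEV_ANGSTROM = 12.39842


def wavelength_to_kev(wavelength_angstrom: float) -> float:
    """Return the photon energy in keV for a wavelength given in angstrom."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths as stated in the data collection paragraph.
WAVELENGTHS = {"peak": 0.97934, "inflection": 0.97958, "remote": 0.97204}
ENERGIES = {name: wavelength_to_kev(w) for name, w in WAVELENGTHS.items()}

if __name__ == "__main__":
    for name, energy in ENERGIES.items():
        print(f"{name}: {energy:.3f} keV")
```

The ordering of the energies (remote above peak above inflection) follows directly from the inverse relationship between wavelength and energy.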

TABLE 10

Data name | PbuCas13b-Se-peak | PbuCas13b-Se-inflection | PbuCas13b-Se-remote | PbuCas13b-native
Ligand in the structure | Citrate | Citrate | Citrate | Citrate

Data collection
Space group | P2₁2₁2₁ | P2₁2₁2₁ | P2₁2₁2₁ | P2₁2₁2₁
Cell dimensions a, b, c (Å) | 90.82, 124.65, 140.73 | 90.86, 124.72, 140.77 | 90.88, 124.76, 140.79 | 90.86, 125.03, 140.57
Cell dimensions α, β, γ (°) | 90, 90, 90 | 90, 90, 90 | 90, 90, 90 | 90, 90, 90
Wavelength (Å) | 0.97934 | 0.97958 | 0.97204 | 1.03320
Resolution (Å) | 140.73-2.32 (2.47-2.32)* | 140.77-2.35 (2.53-2.35)* | 140.79-2.40 (2.62-2.40)* | 93.42-1.97 (2.07-1.97)*
Unique reflections | 59325 | 55600 | 50274 | 111373
R_sym | 0.203 (1.732)* | 0.207 (1.688)* | 0.214 (1.741)* | 0.245 (2.754)*
I/σ(I) | 10.2 (1.4)* | 10.4 (1.5)* | 10.1 (1.5)* | 10.4 (1.8)*
CC1/2 | 0.996 (0.7)* | 0.996 (0.710)* | 0.996 (0.706)* | 0.995 (0.572)*
Completeness (%) | 94.2 (55.8)* | 94.0 (56.6)* | 93.8 (52.0)* | 97.8 (99.2)
Redundancy | 13.6 (12.3)* | 13.5 (12.5)* | 13.5 (12.7)* | 13.3 (14.1)*

Refinement
R_work/R_free** | 0.1700/0.2023
No. atoms, Protein | 9111
No. atoms, Ligands | 41
No. atoms, Water | 657
B-factors (Å²), Protein | 39.06
B-factors (Å²), Ligands | 58.91
B-factors (Å²), Water | 40.20
R.m.s. deviations, Bond lengths (Å) | 0.005
R.m.s. deviations, Bond angles (°) | 0.742
Ramachandran analysis^(#) (%), Favored | 97.01
Ramachandran analysis^(#) (%), Allowed | 2.70
Ramachandran analysis^(#) (%), Outliers | 0.29

*Highest resolution shell is shown in parentheses.
**Rfree was calculated with 5% of the data.
^(#)Distribution of dihedral angles in the Ramachandran diagram was calculated with the MolProbity program (1).
Reference: 1. V. B. Chen et al., MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallographica Section D-Biological Crystallography 66, 12-21 (2010).

Structure Solution

The crystal structure of PbuCas13b was solved by multiwavelength anomalous diffraction (MAD) using selenium as the anomalous scatterer. The positions of 27 SeMet sites were determined and refined using phenix.autosol (5, 6). A partial model was built by phenix.autobuild (7) using a 3.5 Å resolution experimental map with a figure of merit of 0.35. Cycles of manual rebuilding in Coot (8, 9) and refinement in phenix.refine (10-12) were done using the selenium experimental map. R-free flags and experimental phases were transferred from the selenium data to the high-resolution native data using the reflection file editor in PHENIX. These reflections were used for further cycles of rebuilding in Coot and refinement in phenix.refine. Anomalous difference maps were used to ensure correct registry. Refinement in phenix.refine used TLS (translation, libration, and screw), positional, and individual B-factor refinement. Citrate restraints were generated by phenix.elbow (13). The final model contains one polypeptide chain, one RNA chain, two citrate molecules, one tetraethylene glycol (PG4) molecule, two Cl atoms, and 657 water molecules. Figures were created with PyMOL software (14).

Structure Analysis

RNA structure was analyzed using DSSR (15). Protein conservation mapping to the structure was done using the ConSurf server (16). Protein secondary structure was analyzed using the PDBsum webserver (17). APBS, run as part of the PyMOL visualization program, was used to calculate electrostatics (18).

Protein Alignment

Alignments of Cas13b enzymes were done using ClustalW or MUSCLE as implemented in Geneious (19). Neighbor-joining trees were generated using a Jukes-Cantor distance model. Conservation alignments for structure analysis were done on a tree subgroup that successfully matched HEPN domain active site residues to other family members (FIGS. 14-16).

Gel Filtration Experiments

Formation of guide complex: 100 μg of PbuCas13b was incubated with two molar equivalents of guide RNA for 20 minutes at room temperature, in 100 μL of buffer (125 mM NaCl, 10 mM HEPES pH 7.0, 2 mM TCEP). Formation of guide-target complex: 100 μg of PbuCas13b and two molar equivalents of guide RNA were incubated together for 20 minutes as above. Two molar equivalents of target RNA were then added to the solution, and the mixture was incubated at room temperature for an additional 20 minutes (100 μL total, 125 mM NaCl, 10 mM HEPES pH 7.0, 2 mM TCEP). Apo protein was similarly diluted to 1 μg/μL in a buffer solution of 125 mM NaCl, 10 mM HEPES pH 7.0, 2 mM TCEP. Samples were injected from a 2 mL capillary loop onto a GE Superdex 200 Increase 10/300 GL column and run with 500 mM NaCl, 10 mM HEPES pH 7.0, 2 mM DTT buffer.

ThermoFluor Melting Assay

The protocol was adapted from (20). Samples were prepared to a final volume of 20 with 1 μg of PbuCas13b (apo, guide, or guide-target complex, as prepared above) in a solution with a final concentration of 50 mM NaCl, 10 mM HEPES pH 7.0, and 6.25× SYPRO™ Orange Dye. For MgCl2 cleavage and binding experiments, Mg2+ was added to the buffer mix described to a final concentration of 6 mM. For control experiments with non-complementary RNA, 2 molar equivalents of RNA were incubated with the protein complex. Melting experiments were conducted in triplicate on a Roche LightCycler 480 II.

Limited Proteolysis

10 μg of PbuCas13b was incubated with crRNA, or with crRNA and target, for 30 min at room temperature. 400 μg of protease (trypsin, chymotrypsin, or pepsin) was added, and the mix was incubated for 5 min at 37° C., then placed quickly on ice for 2 min before adding SDS loading buffer and running on a 4-12% acrylamide gel.

Protein Expression and Purification of PbuCas13b Pre-crRNA ProcessingMutants

Alanine mutants at each of the putative crRNA-processing catalytic residues were generated by PIPE site-directed mutagenesis cloning from the TwinStrep-SUMO-PbuCas13b expression plasmid and transformed into BL21(DE3)pLysE E. coli cells. For each mutant, 2 L of Terrific Broth media (12 g/L tryptone, 24 g/L yeast extract, 9.4 g/L K2HPO4, 2.2 g/L KH2PO4), supplemented with 100 μg/mL ampicillin, was inoculated with 15 mL of overnight starter culture and grown until OD600 0.4-0.6. Protein expression was induced by the addition of 0.5 mM IPTG and carried out for 16 hours at 21° C. with shaking at 250 RPM. Cells were collected by centrifugation at 5,000 RPM for 10 minutes, and the paste was used directly for protein purification (10-20 g total cell paste). For lysis, bacterial paste was resuspended by stirring at 4° C. in 50 mL of lysis buffer (50 mM Tris-HCl pH 7.5, 500 mM NaCl, 1 mM DTT) supplemented with 50 mg lysozyme, 1 tablet of protease inhibitors (cOmplete, EDTA-free, Roche Diagnostics Corporation), and 500 U of Benzonase (Sigma). The suspension was passed through an LM20 microfluidizer at 25,000 psi, and the lysate was cleared by centrifugation at 10,000 RPM, 4° C. for 1 hour. Lysate was incubated with 2 mL of StrepTactin Superflow resin (Qiagen) for 2 hours at 4° C. on a rotary shaker. Resin bound with protein was washed three times with 10 mL of lysis buffer, followed by addition of 50 μL SUMO protease (in-house) in 20 mL of IGEPAL lysis buffer (0.2% IGEPAL). Cleavage of the SUMO tag and release of native protein was carried out overnight at 4° C. in an Econo-Column chromatography column under gentle mixing on a table shaker. Cleaved protein was collected as flow-through, the resin was washed three times with 5 mL of lysis buffer, and the protein was checked on an SDS-PAGE gel.

Protein was diluted two-fold with ion exchange buffer A containing no salt (50 mM Tris-HCl pH 7.5, 1 mM DTT) to reach a starting NaCl concentration of 250 mM. Protein was then loaded onto a 5 mL Heparin HP column (GE Healthcare Life Sciences) and eluted over a NaCl gradient from 250 mM to 1 M. Fractions of eluted protein (at roughly 700 mM) were analyzed by SDS-PAGE and Coomassie staining, pooled, and concentrated to 1 mL using 50 kDa MWCO centrifugal filters (Amicon). Concentrated protein was loaded onto a pre-equilibrated size exclusion column and eluted using S200 buffer containing 50 mM Tris-HCl pH 7.5, 500 mM NaCl, 2 mM DTT. Monodisperse protein fractions were analyzed by SDS-PAGE and Coomassie staining, followed by concentration and buffer exchange into protein storage buffer (600 mM NaCl, 50 mM Tris-HCl pH 7.5, 1 mM DTT).
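
The two-fold dilution step above can be checked with a simple mixing calculation. The sketch below is illustrative only: it assumes the protein sample starts in a buffer containing 500 mM NaCl (the lysis buffer concentration stated earlier) and that equal volumes of sample and salt-free buffer are mixed. The function name and the volumes are hypothetical.

```python
def mixed_concentration(stock_mM: float, stock_volume: float,
                        diluent_volume: float, diluent_mM: float = 0.0) -> float:
    """Concentration after mixing a stock with a diluent (same volume units for both)."""
    total_volume = stock_volume + diluent_volume
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return (stock_mM * stock_volume + diluent_mM * diluent_volume) / total_volume


# Assumption: sample in 500 mM NaCl, mixed 1:1 with no-salt ion exchange buffer A.
nacl_after_dilution = mixed_concentration(stock_mM=500.0, stock_volume=1.0,
                                          diluent_volume=1.0)

if __name__ == "__main__":
    print(f"NaCl after two-fold dilution: {nacl_after_dilution:.0f} mM")
```

Under these assumptions the result matches the 250 mM starting concentration of the heparin gradient.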

Pre-crRNA Processing Assays

RNA for pre-crRNA processing and nuclease assays was ordered as Ultramers (IDT) and in vitro transcribed using the HiScribe T7 Quick High Yield RNA Synthesis kit (New England Biolabs). RNA was purified with AmpureXP RNA clean-up beads and stored at −20° C. for further use. For testing pre-crRNA processing, WT and mutant proteins were incubated with pre-crRNA at a four-fold molar excess of protein relative to the RNA. Pre-crRNA processing was carried out in Cas13b crRNA processing buffer (10 mM Tris-HCl pH 7.5, 50 mM NaCl, 0.5 mM MgCl2, 20 U SUPERase In (ThermoFisher Scientific), 0.1% BSA) for 30 minutes at 37° C., stopped by adding 2× TBE-Urea gel loading buffer, and denatured for 5 minutes at 95° C. Samples were immediately put on ice for 10 minutes before running them on a 15% TBE-Urea gel in 1× TBE buffer at 200 V for 40 minutes. Gel staining was carried out in 1× SYBR Gold in 1× TBE for 15 minutes, and gels were imaged on a BioRad gel doc system.

Fluorescent Collateral RNA-Cleavage Assay for Pre-crRNA Mutants

Detection assays were carried out in quadruplicate with equimolar ratios of PbuCas13b or PbuCas13b mutants, crRNA, and RNA target, in nuclease assay buffer (20 mM HEPES, 60 mM NaCl, 6 mM MgCl2, pH 6.8) with 0.5 μL murine RNase inhibitor (New England Biolabs) and 125 nM of poly-U homopolymer RNA sensor (Trilink). Samples were incubated for 3 hours at 37° C. on a fluorescent plate reader equipped with a FAM filter set. Measurements were recorded at 5-minute intervals, and data were normalized to the first time-point.
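
The time-course normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions: normalization is taken to mean dividing each reading by the first reading (the text does not say whether division or subtraction was used), the first read is assumed to occur at 0 minutes, and the example fluorescence values are invented for demonstration.

```python
def normalize_to_first(readings):
    """Divide every reading by the first one (assumed meaning of 'normalized')."""
    if not readings:
        raise ValueError("readings must not be empty")
    first = readings[0]
    if first == 0:
        raise ValueError("first reading is zero; cannot normalize by division")
    return [value / first for value in readings]


# 3 hours sampled every 5 minutes, assuming the first read is at t = 0 min.
time_points_min = list(range(0, 3 * 60 + 1, 5))

if __name__ == "__main__":
    example = [200.0, 300.0, 500.0]  # hypothetical raw fluorescence values
    print(len(time_points_min), normalize_to_first(example))
```

With this convention every trace starts at 1.0, so curves from different wells can be compared on a fold-change scale.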

Cleavage Fragment Library

To map Cas13 cleavage products, in vitro cleavage reactions were performed as described above with LwCas13a and PbuCas13b, their respective crRNAs, and target RNA or control. Cleavage was carried out for 5 or 30 minutes, and the products were purified using an RNA oligo clean and concentrator kit (Zymo Research). Small RNA sequencing libraries were prepared using the NEB Multiplex Small RNA sequencing kit and sequenced on an Illumina NextSeq 500 instrument.

Design and Cloning of Mammalian Constructs for RNA Editing

PguCas13b was made catalytically inactive (dPguCas13b) by mutating two arginine and two histidine residues in the catalytic sites of the HEPN domains to alanines (R146A/H151A/R1116A/H1121A). These catalytically inactivated Cas13bs were Gibson cloned into pcDNA-CMV vector backbones containing the deaminase domain of ADAR2 (E488Q) fused to the C-terminal end of the Cas13b via a GS linker (21). To generate truncated versions, primers were designed to PCR amplify dCas13b with progressive C-terminal truncations in steps of 60 bp (20 amino acids), up to 900 bp removed from the C-terminal end (15 truncations in total), and these truncated Cas13b genes were Gibson cloned into the pcDNA-CMV-ADAR2 backbone described above. Guide RNAs targeting Cluc were cloned by Golden Gate cloning into a mammalian expression vector containing the direct repeat sequence for this ortholog at the 3′ end of the spacer sequence destination site, under the U6 promoter.
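
The truncation series described above (60 bp steps, 15 constructs, up to 900 bp removed) can be enumerated with a short sketch. The names below are hypothetical and the code only restates the arithmetic in the text; it does not reproduce primer sequences or the actual constructs.

```python
STEP_BP = 60          # bases removed per truncation step, as stated in the text
N_TRUNCATIONS = 15    # total number of truncated constructs
BP_PER_CODON = 3


def truncation_series(step_bp: int = STEP_BP, n: int = N_TRUNCATIONS):
    """Return (bp_removed, amino_acids_removed) for each C-terminal truncation."""
    if step_bp % BP_PER_CODON != 0:
        raise ValueError("step must be a multiple of 3 to stay in frame")
    return [(i * step_bp, i * step_bp // BP_PER_CODON) for i in range(1, n + 1)]


series = truncation_series()

if __name__ == "__main__":
    for bp_removed, aa_removed in series:
        print(f"-{bp_removed} bp (-{aa_removed} aa)")
```

The first entry removes 60 bp (20 amino acids) and the last removes 900 bp (300 amino acids), consistent with the 15 truncations described.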

The luciferase reporter used was a CMV-Cluc (W85X) EF1alpha-Gluc dual luciferase reporter used by Cox et al. (2017) to measure RNA editing (21). This reporter vector expresses functional Gluc as a normalization control, but a defective Cluc due to the addition of the W85X pretermination site.

Mammalian Cell Culture

Mammalian cell culture experiments were performed in the HEK293FT line (American Type Culture Collection (ATCC)), which was grown in Dulbecco's Modified Eagle Medium with high glucose, sodium pyruvate, and GlutaMAX (Thermo Fisher Scientific), additionally supplemented with 1× penicillin-streptomycin (Thermo Fisher Scientific) and 10% fetal bovine serum (VWR Seradigm).

All transfections were performed with Lipofectamine 2000 (Thermo Fisher Scientific) in 96-well plates. Cells were plated at approximately 20,000 cells/well 16-18 hours prior to transfection to ensure 90% confluency at the time of transfection. For each well on the plate, transfection plasmids were combined with Opti-MEM I Reduced Serum Medium (Thermo Fisher) to a total of 25 μl. Separately, 24.5 μl of Opti-MEM was combined with 0.5 μl of Lipofectamine 2000. Plasmid and Lipofectamine solutions were then mixed and pipetted onto cells.
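
For planning purposes, the per-well volumes above can be scaled to a number of wells. The following is a minimal sketch with hypothetical names; it uses only the per-well volumes stated in the text and deliberately includes no pipetting overage, which a user would need to add according to their own practice.

```python
# Per-well volumes stated in the text (microliters).
PLASMID_MIX_UL = 25.0        # plasmids brought to 25 ul with Opti-MEM
OPTIMEM_FOR_LIPO_UL = 24.5   # Opti-MEM combined with Lipofectamine 2000
LIPOFECTAMINE_UL = 0.5


def transfection_volumes(n_wells: int) -> dict:
    """Scale the per-well transfection volumes to n_wells (no overage included)."""
    if n_wells < 1:
        raise ValueError("n_wells must be at least 1")
    return {
        "plasmid_mix_ul": PLASMID_MIX_UL * n_wells,
        "optimem_for_lipofectamine_ul": OPTIMEM_FOR_LIPO_UL * n_wells,
        "lipofectamine_ul": LIPOFECTAMINE_UL * n_wells,
        "total_per_well_ul": PLASMID_MIX_UL + OPTIMEM_FOR_LIPO_UL + LIPOFECTAMINE_UL,
    }


full_plate = transfection_volumes(96)

if __name__ == "__main__":
    for key, value in full_plate.items():
        print(f"{key}: {value}")
```

Each well therefore receives 50 μl of combined plasmid and Lipofectamine solution under the stated volumes.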

RNA Knockdown in Mammalian Cells

To assess RNA targeting in mammalian cells with reporter constructs, 150 ng of Cas13 construct was co-transfected with 300 ng of guide expression plasmid and 45 ng of the dual luciferase reporter construct. 48 hours post-transfection, media containing secreted luciferase was harvested and measured for activity with BioLux Cypridina and BioLux Gaussia luciferase assay kits (New England Biolabs) on a plate reader (Biotek Synergy H4) with an injection protocol. Signal from the targeted Gluc was normalized to signal from un-targeted Cluc, and subsequently, experiments with PbuCas13b mutant luciferase signal were normalized to experiments with guide-only luciferase signal (the average of three bioreplicates). All replicates performed are biological replicates.
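
The two-step normalization described above can be sketched as follows. This is an illustrative reading of the text, not the Applicants' analysis code: the targeted Gluc signal is divided by the un-targeted Cluc signal, and that ratio is then divided by the mean Gluc/Cluc ratio of three guide-only replicates. All function names and numeric values are hypothetical.

```python
from statistics import mean


def relative_expression(gluc: float, cluc: float, guide_only_ratios) -> float:
    """Gluc/Cluc ratio of a sample, relative to the mean guide-only Gluc/Cluc ratio."""
    if cluc == 0:
        raise ValueError("Cluc signal is zero; cannot normalize")
    baseline = mean(guide_only_ratios)
    if baseline == 0:
        raise ValueError("guide-only baseline is zero; cannot normalize")
    return (gluc / cluc) / baseline


# Hypothetical example: three guide-only bioreplicates with Gluc/Cluc = 2.0 each.
guide_only = [2.0, 2.0, 2.0]
example_value = relative_expression(gluc=50.0, cluc=100.0, guide_only_ratios=guide_only)

if __name__ == "__main__":
    print(f"Relative Gluc expression: {example_value}")
```

A value below 1 indicates reduced reporter expression relative to the guide-only condition; in the invented example the sample retains 25% of the guide-only signal.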

REPAIR Editing in Mammalian Cells

To assess REPAIR activity in mammalian cells, Applicants transfected 150 ng of REPAIR vector, 300 ng of guide expression plasmid, and 45 ng of the RNA editing reporter. Applicants then harvested media with the secreted luciferase after 48 hours and diluted the media 1:10 in Dulbecco's phosphate buffered saline (PBS) (10 μl of media into 90 μl PBS). Applicants measured luciferase activity with BioLux Cypridina and BioLux Gaussia luciferase assay kits (New England Biolabs) on a plate reader (Biotek Synergy Neo2) with an injection protocol. All replicates performed are biological replicates.

REFERENCES

1. W. Kabsch, Integration, scaling, space-group assignment and post-refinement. Acta Crystallogr D Biol Crystallogr 66, 133-144 (2010).
2. W. Kabsch, Xds. Acta Crystallogr D Biol Crystallogr 66, 125-132 (2010).
3. P. R. Evans, G. N. Murshudov, How good are my data and what is the resolution? Acta Crystallogr D Biol Crystallogr 69, 1204-1214 (2013).
4. C. Vonrhein et al., Data processing and analysis with the autoPROC toolbox. Acta Crystallogr D Biol Crystallogr 67, 293-302 (2011).
5. P. D. Adams et al., PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr 66, 213-221 (2010).
6. T. C. Terwilliger et al., Decision-making in structure solution using Bayesian estimates of map quality: the PHENIX AutoSol wizard. Acta Crystallogr D Biol Crystallogr 65, 582-601 (2009).
7. T. C. Terwilliger et al., Iterative model building, structure refinement and density modification with the PHENIX AutoBuild wizard. Acta Crystallogr D Biol Crystallogr 64, 61-69 (2008).
8. P. Emsley, B. Lohkamp, W. G. Scott, K. Cowtan, Features and development of Coot. Acta Crystallogr D Biol Crystallogr 66, 486-501 (2010).
9. P. Emsley, K. Cowtan, Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 60, 2126-2132 (2004).
10. P. V. Afonine et al., Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallogr D Biol Crystallogr 68, 352-367 (2012).
11. N. Echols et al., Automated identification of elemental ions in macromolecular crystal structures. Acta Crystallogr D Biol Crystallogr 70, 1104-1114 (2014).
12. P. H. Zwart et al., Automated structure solution with the PHENIX suite. Methods Mol Biol 426, 419-435 (2008).
13. N. W. Moriarty, R. W. Grosse-Kunstleve, P. D. Adams, electronic Ligand Builder and Optimization Workbench (eLBOW): a tool for ligand coordinate and restraint generation. Acta Crystallogr D Biol Crystallogr 65, 1074-1080 (2009).
14. The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC.
15. X. J. Lu, H. J. Bussemaker, W. K. Olson, DSSR: an integrated software tool for dissecting the spatial structure of RNA. Nucleic Acids Res 43, e142 (2015).
16. H. Ashkenazy et al., ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules. Nucleic Acids Res 44, W344-350 (2016).
17. T. A. de Beer, K. Berka, J. M. Thornton, R. A. Laskowski, PDBsum additions. Nucleic Acids Res 42, D292-296 (2014).
18. E. Jurrus et al., Improvements to the APBS biomolecular solvation software suite. Protein Sci 27, 112-128 (2018).
19. L. A. Ripma, M. G. Simpson, K. Hasenstab-Lehman, Geneious! Simplified genome skimming methods for phylogenetic systematic studies: A case study in Oreocarya (Boraginaceae). Appl Plant Sci 2, (2014).
20. K. Huynh, C. L. Partch, Analysis of protein stability and ligand interactions by thermal shift assay. Curr Protoc Protein Sci 79, 28.9.1-14 (2015).
21. D. B. T. Cox et al., RNA editing with CRISPR-Cas13. Science 358, 1019-1027 (2017).

Results

Type VI CRISPR-Cas systems contain programmable single-effector RNA-guided RNases, including Cas13b, one of the four known type VI subtype family members. Cas13b is unique among these protein families in its linear domain architecture and CRISPR RNA (crRNA) structure. Applicants report the crystal structure of Prevotella buccae Cas13b (PbuCas13b) bound to crRNA at 1.97 angstrom resolution. The structure reveals that the guide RNA was coordinated within Cas13b by a network of direct and indirect interactions that mediated nuclease activity. Applicants identified a second active site for crRNA processing and showed that mutation of key residues in this site abrogates processing activity. Applicants also found that the HEPN2 nuclease domain was non-essential for RNA targeting and established a basis for structure-guided engineering of RNA targeting with Cas13b.

Here Applicants report the structure of Cas13b from Prevotella buccae (PbuCas13b) in complex with a crRNA handle and partial spacer at 1.97 angstrom resolution. The structure revealed the overall architecture of Cas13b nucleases and the molecular basis for crRNA recognition and cleavage.

Applicants solved the crystal structure of PbuCas13b complexed with a 36-nucleotide direct repeat sequence and a short 5-nucleotide spacer (FIG. 1). Similar to other Class 2 CRISPR effectors, the overall shape of PbuCas13b is bilobed (13-19). Five domains are apparent within the structure: two HEPN domains (HEPN1 and HEPN2), two predominantly helical domains (Helical-1 and Helical-2), and a domain that caps the 3′ end of the crRNA with two beta hairpins (Lid domain) (FIG. 1, FIG. 18). To identify similarities to other domains in the Protein Data Bank, the complete PbuCas13b structure as well as isolated domains were queried using the DALI server (15). HEPN1 matched to the HEPN2 domain of LshCas13a.

Both HEPN domains were largely alpha helical: HEPN1 was made of twelve linearly connected α-helices with flexible loops in between the helices. HEPN2 was composed of nine α-helices, several short β-strands, and a β-hairpin with charged residues at the tip, which pointed towards the active site pocket. HEPN2 rested on HEPN1 such that the active site residues (R156, N157, H161 and R1068, N1069, H1073) were assembled into a canonical HEPN active site, despite being at the N- and C-terminal extremities of the linear protein (FIG. 1) (3, 17, 18, 20). The HEPN1 domain was connected to the Helical-1 domain by a highly conserved inter-domain linker (IDL) that reached across the center of a large, positively charged inner channel (FIG. 1). Mutation of conserved residues of the IDL (R285, K292, E296) to alanine reduced the ability of PbuCas13b to interfere with luciferase expression in mammalian cells by cleaving luciferase mRNA, demonstrating a role in general nuclease activity (FIGS. 5A,C).

Helical-1 was broken up linearly into three segments by the Helical-2 and Lid domains. Helical-1 made extensive sugar-phosphate and nucleobase contacts with the direct repeat RNA (FIG. 2, FIG. 3). Helical-1 also made minor interface contacts with both HEPN1/2 and the Lid domains. The Lid domain had mixed α and β secondary structure and capped the 3′ free end of the direct repeat RNA with two charged β-hairpins. The longer of the two β-hairpins reached across the RNA loop to contact the Helical-1 domain, forming a lid over the free RNA ends. Positively charged residues from the Lid domain pointed into a large central channel running through the center of the protein complex (FIG. 1, FIG. 6C). A positively charged side channel penetrating from the outer solvent to the inner channel was formed between a disordered loop (K431 to T438) of the Lid domain and the two HEPN domains (FIG. 6A). The Helical-2 domain was made of eleven α-helices and wrapped under the body of the direct repeat RNA via its connection to Helical-1. Helical-2 interfaced extensively with the HEPN1 domain and made minor contacts with the extended β-hairpin of the Lid domain. A second positively charged side channel was located between Helical-1 and Helical-2, providing bulk solvent accessibility to the crRNA (FIG. 6B). All domains, the IDL, and the crRNA formed the large central channel, the inside of which was lined with positively charged residues (FIG. 6C).

Nuclease-dead Cas13b fused to an ADAR deaminase domain was used for REPAIR to achieve targeted RNA base editing (11). AAV-mediated delivery is commonly used for gene therapies, but REPAIR exceeds the size limit of AAV's cargo capacity (11, 21). Applicants showed previously that C-terminal truncations of Cas13b from Prevotella sp. P5-125 (PspCas13b) did not decrease REPAIR activity. Applicants further used another ortholog of Cas13b, from Porphyromonas gulae (PguCas13b), which was stably expressed and showed high activity in mammalian cells, in contrast to PbuCas13b (11). Based on alignments between PbuCas13b and PguCas13b, Applicants made truncations to remove the HEPN2 domain, fused the truncated proteins to ADAR, and tested their ability to carry out base editing with the REPAIR system. Surprisingly, not only did these truncated mutants retain RNA targeting, some were significantly more efficient at RNA editing (FIG. 7).

Cas13b has been shown to function efficiently in REPAIR with crRNAs of various lengths, with spacers ranging from 30 to 84 nucleotides (11). Unambiguous density for all RNA bases enabled complete model building of the direct repeat RNA. The structure revealed that Cas13b recognized the direct repeat by extensive sugar-phosphate and nucleobase interactions (FIGS. 2 and 3). The direct repeat was mostly buried between the two Helical domains and the Lid domain but protruded slightly from Helical-1, explaining how Cas13b was able to utilize an alternate, longer crRNA. The overall crRNA structure was a deformed A-form duplex comprising a stem (bases G(−1)-G(−4), C(−33)-C(−36)), loop (C(−5)-U(−8), A(−29)-A(−32)), stem (U(−9)-U(−14), A(−23)-A(−28)), bulge (C(−15), G(−21)), and hairpin loop (U(−16)-U(−20)) architecture (FIGS. 2 and 3). Helical-1 and Helical-2 mediated direct and indirect recognition of the crRNA hairpin together with the Lid domain, which capped the 3′ free end.
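As a reading aid, the direct repeat architecture listed above can be written down as a small lookup table. The Python sketch below is only bookkeeping of the position ranges given in the text; the names `DR_ELEMENTS` and `element_at` and the labels "stem 1" and "stem 2" are hypothetical and are not part of the disclosure.

```python
# Bookkeeping sketch (hypothetical names): structural elements of the
# direct repeat, using the negative position numbering of the text.
# Each element maps to one or two inclusive (low, high) position ranges.
DR_ELEMENTS = [
    ("stem 1", [(-4, -1), (-36, -33)]),
    ("loop", [(-8, -5), (-32, -29)]),
    ("stem 2", [(-14, -9), (-28, -23)]),
    ("bulge", [(-15, -15), (-21, -21)]),
    ("hairpin loop", [(-20, -16)]),
]


def element_at(position: int) -> str:
    """Return the element containing a direct repeat position.

    Positions that the text does not assign to any element
    are reported as "unassigned".
    """
    for name, ranges in DR_ELEMENTS:
        for low, high in ranges:
            if low <= position <= high:
                return name
    return "unassigned"


if __name__ == "__main__":
    # Positions of the three flipped-out bases discussed in the text.
    for pos in (-8, -20, -29):
        print(pos, element_at(pos))
```

Such a table makes it straightforward to check which element a given direct repeat mutation falls in when reading the mutational data.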

Three bases, C(−8), U(−20), and A(−29), were flipped out from the body of the RNA. The backbone carbonyl of T754 stabilized the flipped-out, highly conserved C(−8) base by interacting with the base N4 amine, holding the base in a hydrophobic pocket of highly conserved residues (Y540, 566-571, K751, 753-761) in the Helical-1 and Helical-2 domains. The base flip was further stabilized by interaction between the C(−8) N3′ and the sugar (O2′) of U(−7). Changing C(−8) to G or U decreased nuclease activity and destabilized the protein-RNA complex as measured in a thermal stability assay (FIG. 3). U(−20) was also absolutely conserved in Cas13b direct repeat sequences and was coordinated by completely conserved residues, most notably R762, which made contacts with the nucleobase O2, and R874, which intercalated between G(−21) and U(−20), holding the base out and making contacts with the U(−20) sugar O4′. Mutation of R762 to alanine dramatically reduced RNA interference in mammalian cells (FIG. 5A). Mutating U(−20) to G decreased nuclease activity (FIG. 3). In contrast to C(−8) and U(−20), A(−29) was not conserved in Cas13b direct repeat sequences, and its nucleobase was not coordinated by any amino acids. Instead, A(−29) engaged in multiplet base pairing with G(−26) and C(−11) (FIG. 8F) (22). A(−29) was tolerant to identity changes to any other base, but mutation to G slightly decreased general nuclease activity (FIG. 9). Base identity changes that affected general nuclease activity also decreased the thermal stability of the Cas13+crRNA complex. Consistent with this observation, Applicants found that changing the wobble base pair between U(−27) and G(−10) to a Watson-Crick base pair increased general nuclease activity (FIG. 2D). However, changing A(−32) to G, which also created a Watson-Crick base pair, decreased stability and reduced RNase activity (FIG. 2D).

The hairpin loop was recognized by a network of protein interactions from highly conserved residues within the Helical-2 domain (FIG. 3). K870 coordinated with O4 from both U(−16) and U(−19), which indirectly flipped U(−17) into the solvent at the hairpin turn, with no visible residue contacts. W842 stacked with the nucleobase of U(−18) while also interacting with the phosphate backbone together with K846. R877 and E873 further stabilized U(−18) through interactions with base N3 and O2 positions. R874 and R762 stabilized the U(−20) position through sugar O4′ and base O2′ interactions, respectively.

The hairpin loop distal end of the crRNA (−1 to −4 and −33 to −36) was helical and recognized by a combination of base and backbone interactions (FIG. 3). Notably, N653 and N652 made critical minor groove direct contacts with U(−2) and C(−36) and coordinated the 5′ and 3′ ends of the hairpin. Disruption of these base identities or mutation of N653 or N652 to alanine substantially decreased Cas13b activity in vitro and in mammalian interference assays (FIG. 2E, FIG. 5). C(−33) was coordinated by N756 via the nucleobase O2 and sugar O2′, and changing this C to A or G abrogated general RNase activity and decreased protein stability (FIG. 2D, FIG. 9).

The RNA hairpin end (nucleotides −17 to −20) was stabilized by extensive phosphate backbone hydrogen bonding and base interactions (FIGS. 2, 3). Mutating U(−18) to G abolished general nuclease activity. The same was observed for U(−19) or U(−20), but other bases were tolerated, suggesting that the G O6 or N2 nucleobase atoms disrupted nuclease activity (FIG. 9).

The crystallized RNA substrate included five bases of a spacer sequence (U1-G5), though only the first nucleotide 5′ of the direct repeat was visible in the density. The 5′ end of the RNA direct repeat and the first base of the spacer were supported by residues from Helical-2 and pointed up into the central channel and towards the side channel between the Lid and HEPN domains (FIG. 3). U(1) was not coordinated by base-specific contacts, but was in a net positively charged pocket in the Lid domain. Mutation of charged and aromatic amino acids near the spacer U(1) had little effect on general nuclease activity, suggesting that spacer RNA coordination by these residues is either not present or not essential (FIG. 5H).

Some Class 2 CRISPR systems process long pre-crRNAs into mature crRNAs (3, 7, 11, 23). Cas13b has been shown to process its own crRNA at the 3′ end (3). A number of highly conserved residues are in contact with or near the 3′ end of the RNA and potentially form a second, non-HEPN nuclease site. To test for a second nuclease site, Applicants mutated four conserved residues near the 3′ RNA end and tested these mutants for crRNA processing and target-activated nuclease activity (FIG. 2). Mutation of K393 to alanine abrogated RNA processing but retained targeted nuclease activity, confirming the location of a second nuclease site in the Lid domain responsible for crRNA processing (FIGS. 2, 6, 10). R482A slightly affected crRNA processing, but significantly affected general nuclease activity. This is likely due to the importance of this residue in stabilizing the crRNA (FIG. 2).

The resolved spacer nucleotides pointed toward the HEPN lobes and into the positively charged channel. However, the channel was not large enough to accommodate an RNA duplex, suggesting that Cas13b adopted an open conformation in response to target binding. Applicants measured changes in Cas13b conformation in apo, guide, and guide+target RNA complexes using a thermal denaturation assay. Target-bound Cas13b adopted a less stable conformation compared to guide-only Cas13b, but this change was not observed in the presence of non-target RNA (FIG. 11). Limited proteolysis gave similar results; guide+target bound complexes were less protease resistant than guide-only complexes (FIG. 12).

Although there was a single molecule in the asymmetric unit of the crystal, a loop from one monomer made trans contacts with another, coordinating a bound citrate from the crystallization buffer in the active site. To test whether the trans-subunit contact is functional, and whether PbuCas13b functions cooperatively in trans via this loop, Applicants mutated the residues at the tip of this loop (Q646 and N647) to see if they would affect activity. Mutation of each decreased RNA interference in mammalian cells, suggesting the possibility of trans-subunit regulation of general nuclease activity (FIG. 5F).

Lastly, Applicants compared Cas13b to the structure of LshCas13a (FIG. 4) (17). In addition to general functional similarities between these family members, there were structural similarities between the nucleases, especially in the HEPN domains and active site architecture (FIGS. 4B, 4C). However, a SAS search provided a match to the crystal structure of LbCas12a (previously referred to as LbCpf1) and highlighted a bridge-helix-like sub-domain within Cas13b (24). Although this domain was poorly conserved within the Cas13b family, it appeared to be a common structural feature with Cas12a that mediated essential nucleic acid contacts (FIGS. 5D, 13). Given the fundamental differences between Cas13b and Cas12a, Applicants postulated that the bridge helix arose convergently and did not indicate a common ancestor for these two proteins. Nonetheless, Applicants referred to this feature as the bridge helix for consistency with the nomenclature of other Class 2 effectors (1, 14).

Table 11 below lists exemplary PbCas13b mutants which were produced and tested.

TABLE 11. List of mutations tested for RNA interference. List of mutations and averaged normalized fluorescent values from three biological replicates.

Mutation    Guide 1 normalized RLU    Guide 2 normalized RLU
R53A        0.747858                  0.618255
R53K        0.533437                  0.415443
R53D        0.708809                  0.656473
R53E        0.653859                  0.505983
Y164A       0.560344                  0.423418
Y164F       0.555361                  0.419603
Y164W       0.578905                  0.411809
K183A       0.611807                  0.434156
K193A       0.637075                  0.435537
R285A       0.621679                  0.473138
K292A       0.709821                  0.47966
E296A       0.753062                  0.402674
N297A       0.697938                  0.407599
T405A       0.668786                  0.366621
H407A       0.541401                  0.358297
H407Y       0.503637                  0.335036
H407W       0.528546                  0.359063
H407F       0.495551                  0.341844
K457A       0.632984                  0.441894
H500A       0.549885                  0.34935
K570A       0.575468                  0.362485
K590A       0.587262                  0.383572
R600A       0.565624                  0.417064
K607A       0.687397                  0.430726
R614A       0.744806                  0.450827
N634A       0.661617                  0.386325
R638A       0.696471                  0.410163
Q646A       0.675146                  0.372062
N647A       0.677884                  0.400227
N652A       0.665943                  0.406755
N653A       0.650794                  0.384887
K655A       0.900461                  0.58631
S658A       0.625349                  0.363679
K741A       0.648908                  0.401644
K744A       0.651516                  0.42232
N756A       0.650638                  0.447333
S757A       0.618393                  0.402225
R762A       0.862666                  0.577076
R791A       0.644193                  0.444169
K826A       0.600621                  0.395086
K828A       0.619022                  0.416127
K829A       0.593882                  0.404272
K846A       0.576794                  0.407463
K857A       0.595231                  0.40528
R877A       0.683229                  0.461362
K943A       0.573508                  0.394379
K943R       0.60418                   0.403167
K943D       0.69041                   0.386955
K943E       0.702192                  0.372508
R1041A      0.662393                  0.371243
R1041K      0.629986                  0.376016
R1041D      0.813623                  0.680736
R1041E      0.842593                  0.484389
D397A       0.63295                   0.376658
E398A       0.557761                  0.365275
D399A       0.590279                  0.368724
E400A       0.560351                  0.349016
D434A       0.524497                  0.364659
R618A       0.611197                  0.401871
R830A       0.663284                  0.405163
Q831A       0.548391                  0.351777
K835A       0.503864                  0.373504
K836A       0.503571                  0.374439
R838A       0.549749                  0.372399
WT          0.563282                  0.335038
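To read Table 11 against wild type, each mutant's value can be divided by the WT value for the same guide. The Python sketch below does this for a few rows copied from the table. The names `TABLE_11_SUBSET` and `ratio_to_wt` and the use of a plain ratio are illustrative assumptions, not the Applicants' analysis.

```python
# Illustrative sketch: a few rows copied from Table 11.
# Values are (Guide 1 normalized RLU, Guide 2 normalized RLU).
TABLE_11_SUBSET = {
    "WT": (0.563282, 0.335038),
    "K655A": (0.900461, 0.58631),
    "R762A": (0.862666, 0.577076),
    "N652A": (0.665943, 0.406755),
    "N653A": (0.650794, 0.384887),
}


def ratio_to_wt(table, mutant):
    """Return (guide 1, guide 2) values of a mutant divided by the WT values."""
    wt_g1, wt_g2 = table["WT"]
    g1, g2 = table[mutant]
    return (g1 / wt_g1, g2 / wt_g2)


if __name__ == "__main__":
    for name in TABLE_11_SUBSET:
        r1, r2 = ratio_to_wt(TABLE_11_SUBSET, name)
        print(f"{name:6s} guide 1: {r1:.2f}x WT   guide 2: {r2:.2f}x WT")
```

A ratio above 1 means a higher normalized reporter signal than wild type for that guide. Assuming the reporter signal rises when interference is weaker, this would be consistent with the reduced RNA interference described above for R762A.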

The structure of PbuCas13b provided new information on the structural diversity of the type VI protein family and highlighted the differences and similarities between Cas13a and Cas13b. Applicants showed the structural basis for crRNA recognition and processing and revealed key regulators of nuclease activity in both the guide RNA and the protein. Based on the structure of PbuCas13b, Applicants were able to generate a smaller variant of the REPAIR platform that maintained base editing efficiency and could be packaged into AAV. The data suggest that a major domain reconfiguration occurs during target recognition. Insights from the structure of PbuCas13b enabled rational engineering to improve functionality for RNA targeting specificity, base editing, and nucleic acid detection (11, 12, 25, 26).

Example 2

FIG. 19 shows, from a PyMOL file, the position of the coordinated nucleotide in the active site of Cas13b. The view is a structural alignment based on a crystal structure of RNase L in complex with a U nucleotide. This alignment placed the nucleotide within the active site of Cas13b and revealed likely residue interactions. Loops involved in base specificity are annotated in the figure.

Example 3

The RNA loop may be extended. The extended RNA guide loop may add functional RNA motifs. FIG. 20 shows an exemplary RNA loop extension.

Example 4

FIG. 21 shows exemplary fusion points via which a nucleotide deaminase is linked to a Cas13b. The fusion points may be one or more amino acids on Cas13b. For example, the fusion points may be one or more of amino acids 411-429, 114-124, 197-241, and 607-624. In one example, the amino acids are in Prevotella buccae Cas13b.

Example 5

Mutations in ADAR affecting ADAR activity were screened using yeast screening. The screen was performed in multiple rounds. Each round of screening yielded a set of candidate mutations. The candidate mutations were then validated in mammalian cells. The top-performing mutations were added to the previous version of the mutation set and re-screened. The mutations screened in 10 rounds are shown in the table below. The mutant identified in round n was designated as “RESCUE v(n−1).” As discussed herein, RESCUE refers to mutations that convert adenosine deaminase activity to cytidine deaminase activity.

TABLE 12

RESCUE Round    ADAR mutations                                                      Plasmid
RESCUEv0        E488Q                                                               pAB0048
RESCUEv1        E488Q, V351G                                                        pAB0359
RESCUEv2        E488Q, V351G, S486A                                                 pAB1188
RESCUEv3        E488Q, V351G, S486A, T375S                                          pAB0642
RESCUEv4        E488Q, V351G, S486A, T375S, S370C                                   pAB1072
RESCUEv5        E488Q, V351G, S486A, T375S, S370C, P462A                            pAB1135
RESCUEv6        E488Q, V351G, S486A, T375S, S370C, P462A, N597I                     pAB1146
RESCUEv7        E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I              pAB1194
RESCUEv8        E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V       pAB1220
RESCUEv9        E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I    pAB1327
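The mutation sets in Table 12 are cumulative: each version carries every mutation of the previous version plus one new mutation. The Python sketch below reconstructs that bookkeeping for illustration; `ROUND_ADDITIONS`, `rescue_mutations`, and `version_for_round` are hypothetical names, and the order of additions is read off the table.

```python
from typing import List

# Mutation newly added at each version, in the order read from Table 12.
# Index 0 corresponds to RESCUEv0, index 9 to RESCUEv9.
ROUND_ADDITIONS: List[str] = [
    "E488Q", "V351G", "S486A", "T375S", "S370C",
    "P462A", "N597I", "L332I", "I398V", "K350I",
]


def rescue_mutations(version: int) -> List[str]:
    """Return the cumulative ADAR mutation set for a RESCUE version number."""
    if not 0 <= version < len(ROUND_ADDITIONS):
        raise ValueError(f"no RESCUE version v{version} in Table 12")
    return ROUND_ADDITIONS[: version + 1]


def version_for_round(round_number: int) -> str:
    """Name of the mutant identified in screening round n (round 1 gives v0)."""
    if round_number < 1:
        raise ValueError("screening rounds are numbered from 1")
    return f"RESCUEv{round_number - 1}"


if __name__ == "__main__":
    for v in range(len(ROUND_ADDITIONS)):
        print(f"RESCUEv{v}: {', '.join(rescue_mutations(v))}")
```

Storing only the mutation added at each version avoids repeating the full list in every row and keeps the cumulative sets consistent by construction.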

Screening for mutations for RESCUE v9 was performed (FIG. 22). Effects of RESCUEv9 were validated on T-flip guides (FIG. 23) and C-flip guides (FIG. 24). At least about 60% editing for T, A, and C motifs and 25% editing for the G motif were achieved with RESCUEv9. Performance of RESCUEv9 was tested with endogenous targeting (with T-flip guides) (FIG. 25).

Screening for mutations for RESCUE v10 was performed (FIG. 26).

30-bp guides were tested for C-flips (FIG. 27).

Comparisons between Cas13b6 and Cas13b12 with RESCUE v1 through v8 were performed. Gluc/Cluc results are shown in FIG. 28, fraction editing results are shown in FIG. 29, and effects on endogenous targeting (T-flips) with RESCUEv8 are shown in FIG. 30.

Effects of RESCUEs on base conversion (C to U and A to I activities) were compared (FIG. 31). CCN 3′ motif targeting was tested (FIG. 32).

Example 6

Constructs with various dead Cas13b (including dCas13b) fused with ADAR via a linker were generated (FIG. 33A) and tested (FIG. 33B). The constructs also had an N-terminal tag (HIVNES). Sequencing of the N-terminal tag and linkers was performed (FIG. 34).

Quantification of off-targets was performed (FIG. 35). Off-target edits were tested (FIG. 36). Endogenous genes targeted with (GGS)2/Q507R were tested (FIG. 37). The eGFP screening of mutations on (GGS)2/Q507R was performed (FIGS. 38 and 39).

Constructs in which the dCas13b was a truncated Cas13b were generated (FIG. 40A) and tested (FIG. 40B). The constructs also had an N-terminal tag (NES/NLS). Multiplexed on/off-target guides were generated for screening (FIG. 41).

Example 7

Mutations in ADAR affecting ADAR activity were screened using yeast screening. The screen was performed in multiple rounds. Each round of screening yielded a set of candidate mutations. The candidate mutations were then validated in mammalian cells. The top-performing mutations were added to the previous version of the mutation set and re-screened. The mutations screened in 10 rounds are shown in the table below. The mutant identified in round n was designated as “RESCUE v(n−1).” As discussed herein, RESCUE refers to mutations that convert adenosine deaminase activity to cytidine deaminase activity.

TABLE 13

RESCUE Round    ADAR mutations                                                      Plasmid
RESCUEv0        E488Q                                                               pAB0048
RESCUEv1        E488Q, V351G                                                        pAB0359
RESCUEv2        E488Q, V351G, S486A                                                 pAB1188
RESCUEv3        E488Q, V351G, S486A, T375S                                          pAB0642
RESCUEv4        E488Q, V351G, S486A, T375S, S370C                                   pAB1072
RESCUEv5        E488Q, V351G, S486A, T375S, S370C, P462A                            pAB1135
RESCUEv6        E488Q, V351G, S486A, T375S, S370C, P462A, N597I                     pAB1146
RESCUEv7        E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I              pAB1194
RESCUEv8        E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V       pAB1220
RESCUEv9        E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I    pAB1327
RESCUEv10       E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L    pAB1411

Multiple rounds of validation of RESCUEv10 were performed (FIGS. 42A-42E). RESCUEv10 was analyzed by next generation sequencing (NGS) (FIG. 43). Mutations that improve specificity were identified (FIG. 44). Effects of RESCUE on endogenous targeting (C-flips and T-flips) were tested (FIG. 45).

RESCUEs were used for targeting β-catenin. FIG. 46 shows targeting of β-catenin using RESCUE v6 and v9. FIG. 47 shows a new β-catenin secreted Gluc/Cluc reporter. FIG. 48 shows results of targeting β-catenin by RESCUEv10.

RESCUE may also be used for targeting other genes. FIG. 49 shows targeting of ApoE4 by RESCUEv10.

Example 8

This example shows base editing of β-catenin using RESCUE to increase the stability of β-catenin and thereby improve proliferation and survival of HUVECs in a nutrient-deficient medium.

HUVECs are grown in a nutrient-rich medium. Cells are transformed with adenovirus containing RESCUE constructs. The RESCUE construct targets β-catenin and generates an S37A mutation. The transformed cells are passaged at low confluence into a nutrient-deficient medium. Cell proliferation and survival rate are measured using a cell-counting kit.

Example 9

This example shows base editing of the serine protease PCSK9 in HepG2 cells. The base editing modulates low density lipoprotein (LDL) cholesterol uptake in HepG2 cells by inducing patient-derived mutations in PCSK9.

A GFP expression construct is transfected into HepG2 cells using various transfection reagents. The transfection reagent resulting in the best GFP expression is selected for transfecting RESCUE constructs. RESCUE constructs are transfected using 30 bp guides with target sites at 5′ positions 5, 7, 9, and 11. One or more mutations in PCSK9 are generated by RESCUE. Exemplary mutations are shown in FIG. 50.

RT-PCR and sequencing are performed to identify the best-performing guides. Cytosolic LDL is fluorescently labeled, and cellular uptake of cytosolic LDL is measured by cell imaging. PCSK9 secretion is monitored using ELISA and/or immunoprecipitation.

Example 10

This example lists information and data related to Cas13b-t. The respective sizes of Cas13b-t1, Cas13b-t2, and Cas13b-t3 are listed in Table 14.

TABLE 14

Naming Key    Size
Cas13b-t1     804 aa
Cas13b-t2     802 aa
Cas13b-t3     775 aa
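For orientation, the protein sizes in Table 14 can be converted to coding sequence lengths using three nucleotides per codon. The Python sketch below performs this arithmetic; the names `SIZES_AA` and `coding_length_nt` are hypothetical, and the optional stop codon is an assumption about how a given construct is built rather than a statement about the disclosed loci.

```python
# Sizes in amino acids, copied from Table 14.
SIZES_AA = {"Cas13b-t1": 804, "Cas13b-t2": 802, "Cas13b-t3": 775}


def coding_length_nt(n_residues: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode n_residues, optionally plus one stop codon."""
    return 3 * n_residues + (3 if include_stop else 0)


if __name__ == "__main__":
    for name, size in SIZES_AA.items():
        print(f"{name}: {size} aa, {coding_length_nt(size)} nt including a stop codon")
```

The resulting lengths cover only the open reading frame and do not account for promoters, tags, linkers, fused deaminase domains, or guide expression cassettes.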

Amino acid sequences of Cas13b-t1, Cas13b-t2, and Cas13b-t3 are shown below:

Cas13b-t1 (SEQ ID NO: 272)mndkstwqlklhrivrwsflrrqrvgcdishhfdfilvrrsgiknmefenikktsnkevysiegyegekkwcfaivlnraqtnleenpklfeqtltrfekimkqdwfneetkkliyekeeenkvkeeiqiaaserlknlrnyfshylhapdclifnrndtiriimekayeksrfeakkkqqedisiefpelfeeedkitsagvvffvsffierrflnrlmgyvqgfrktegeynitrqvfskyclkdsysvqaqdhdavmfrdilgylsrvpteiyqhikltrkrsqdqlserktdkfilfalkyledyglkdladytacfarskikrenedtketdgnkhkfhrekpvveihfdkekqdqfyikrnnvilkaqkkggqsnyfrmgvyelkylvllsllgkaeeaiqridryisslkkqlpyldkisneeiqksinflprfvrsrlgllqvddekrlktrleyvkakwtdkkegsrklelhrkgrdilryinercdrplsrkeynnilkfivnkdfagfyneleelkrtrrldkniiqklsghttlnalhervcdlvlqelgslqsenlkeyiglipkeekevtfrekvdrileqpvvykgflryeffkedkksfarlveeaiktkwsdfdiplgeeyynipsldrfdrtnkklyetlamdrlclmmarqyylrlneklaekaqhiywkkedgreviifkfqnpkeqkksfsirfsildytkmyvmddpeflsrlweyfipkeakeidyhkhyarafdkytnlqkegidailklegriierrkikpaknyiefqeimnrsgynndqqvalkrvrnallhynlnferehlkrfygyvkregiekkwsliv Cas13b-t2(SEQ ID NO: 273) mqvenikkgssqgmysiegyegakkwcfaivlnraqtnlqgnpklfeetltrferirkedwfdqetkkliyakqeqneveeeiqkaadeklrdlrnyfshyfhtpdcliftqndpvriimekayekarfeqakkeqedisiefgelfeengritsagvvffasffaerrflnrlmgyvqgftrtegeykitrdvfstyclrdsysvktpdhdavmfrdilgylsrvpsesyqrikesqmrsetqlserktdkfilfalnyledygledladytacfartrikreqdentdgkeqkphrkkprveihferaegdpfyikhnnvilrtqkkgaqtyifrmgvyelkylvllsllgkgaeavkridryvhslrnqlphiekksteeiegyvrflprfvrshlgllgvddekkikarvdyvkakwlekkeksrelqlhrkgrdilryinercerplnideynrilellvtkhldgfyreleelkktrridknivenlsrhksvnalhekvcdlvvqeleslgreelkeyvglipkeekevsfeektdrvvkqpviykgflrneffresrksfarlveeavrekgevydvplggeyyeivsldtfdkdnkrlyetlamdrlllmiarqyhlslnkelakraqqiewkkedgeeviiftlknpaqpeqscsvrfslrdytklyvmddaeflarlcdyflpkdeeqidyhrlytqgmnrytnlqregieailelekktigpeqprppknyipfseimdksayneddqkalrrvrnallhhnlnfaradfkrfcgimkregiekrwsl av Cas13b-t3(SEQ ID NO: 274) 
maqvskqtskkrelsideyqgarkwcftiafnkalvnrdkndglfvesllrhekyskhdwydedtralikcstqaanakaealrnyfshyrhspgcltftaedelrtimerayeraifecrrreteviiefpslfegdrittagvvffvsffverrvldrlygaysglkknegqykltrkalsmyclkdsrftkawdkryllfrdilaqlgripaeayeyyhgeqgdkkrandnegtnpkrhkdkfiefalhyleaqhseicfgrrhivreeagagdehkkhrtkgkvvvdfskkdedqsyyisknnvivridknagprsyrmglnelkylvllslqgkgddaiaklyryrqhvenildvvkvtdkdnhvflprfvleqhgigrkafkgridgrvkhvrgvwekkkaatnemtlhekardilqyvnenctrsfnpgeynrllvclvgkdvenfqaglkrlqlaeridgrvysifaqtstinemhqvvcdqilnrlcrigdqklydyvglgkkdeidykqkvawfkehisirrgflrkkfwydskkgfaklveehlesgggqrdvgldkkyyhidaigrfeganpalyetlardrlclmmaqyflgsvrkelgnkivwsndsielpvegsvgneksivfsvsdygklyvlddaeflgriceyfmphekgkiryhtvyekgfrayndlqkkcveavlafeekvvkakkmsekegahyidfreilaqtmckeaektavnkvrraffhhhlkfvidefglfsdvmkkygiekewkfpvk

Loci of Cas13b-t1, Cas13b-t2, and Cas13b-t3 are shown in FIGS. 54A-54C. The sequences of the loci are shown below:

Cas13b-t1 locus (SEQ ID NO: 275)agctgtcccgctgagatattaacaagcattaccgctaaattttccgcggactgttggttttcagcttcgtgaatgccaacaacaaaaggccctgtcgaaagcacaatttcggtggtgtcatagaaatccaggactttgccttcgagggttttattggttgccttctttgctgtggcgccattttcaatcagaaagctgcgatagctttctgcgactgcctcggcatctttgggaccggagcgtttgctcagaaatgccgtgatggtttcaccgttaagctggtatccggcagcgaagatgtcagtcaatccttcaaagccaaatgcacttgccagataaagataatgatcctgggaccaaattatcctttggcaggtgctcgatttcaggtatagcggtatcatcgtgaacggccaggttcgtgggaattttccttgcgacttccgccattgccgcaaacagctcatccgattcggcgaagccgaccagctcgatataatattggccgtgcgcaagataaaacgcattactggttttgtatgcaaattgcatatccggcaggttctcaacttcgggcctttttgcacgctgtaaaccgagaatgcgtttctggttctggccatatcaaagatatagagctccatcaccaggttttcatccgcctggcttacaaatctctgggtggacaattttataaaaccagcgtcgatataaaggggggccttgccgttaatcttttcgtaaagattttcggtggtgtagacttcaatttctgaaagcgttttgaatccgtaaggcagaagaaaagtcaggtctttcttttgttttggcatctgctttatgaataccccaacggcgataagtaagagaatcgctaataagcagatgcctataacagattcgagacgttttgcccggcttggtaccgaacccataaccaactccagtaatgacaaattacttgactttataaccgggctggattataatttttgccggtgttgctgtcaaccccaaatgctacaggtgaaaaaggcgaagatagatttctaacgaggttgacaaagcaggtcagggcgtgttataataggttgctaaagtaaaaaggagactgaaatgattgaatatgcacaatatttggggttttggacgccgggcccccttgaaattgctgttattgcgattgtcgctcttctgatattcggcagacggctgcctgaaatcgcccgcaacgtaggcaagagcctgactgaattcaagaaggggatcacgaggccaaggagaccaaggacgaattggtggatgatgtccgggaagtcaaggatgatgtggtaagagaggcgaaggatgccgccgggctgaatgaagaggatacaatgggctctgattgattattgataaaggggaactaatcactgagaacaattgtcaatcattaatcaacaatcaatattgaagatccgcctgtggcggaatcaatttttaagatgggcgatacaaagaagaaagaggacctccttgattccactatgagtctgggcgaccaccttgaggaattgcggatgcggctgattcgcgcgctggtgggcctggcgttagctatattatctgtctgatcttcggcaagctgctgatatcatttattcaaaaaccttacgttgctgtgatgggtgaagaggctactctgaagacgcttgccccggcccaagggattaacagctacgtaaaaatagccttggtctcaggcttgatattctcatcgccctgggtatctaccagttatggatgttcgtggctgcaggactctatcctaatgaaaaaagatatgtgtatgtagcagtacctttttcggtggtattatttgttgccggagctttgtttttcatctttgtagtggcagaagtgtctcttgctttcttaataaaggtcgacaggtggctcggactggaacccgactggactttcccg
aagtatgtgacctttgtaaccaccctgatgctggtatttggtgttgcgtttcagaccccgatagctattttctttttgaacaagacaggtctggtttcagtccaggcgttacggcggtcaagaaaatatgtactgctacttatcgttgtagtagcagctatggcgactccgcctgatgtggtttctcaagtaacactggcgataccgttgtatgtgctgtttgaattaggcatactgctgagttactttgcagaactaaaaaagagaaagtcgaaaaacaaccagtgataagccgacaatccccagctttcccagtaccgactacttgtttctttcgggcctggtttttatttcgtcaatcgagcgactaagaaatcttcaaaggcgcttaaatccttccataccgtggcacagttaatggttttggctttgttatctattacggtgtatccatagtcggtaacccgaatgccgagtttttcgggctcattttagacatttgcatctatgccgccggcagcgctgaaggttttttcggagctaattgagtattcagcataaatgttgaacggttttgccaatgcgggtactatgatgttgatgctaacgttgataaatacaaatgtgatggtccctcccatagggcctgtcggcctggactatatcgcaggagccgtcagggcagccgggaaccaggcagacgtagttgatttatgtcttgctgatgacccgtcaaagactctccagggctatttcgctacgcacagcccgcaattggtgggggtctcttttcgcaatgtggacgattctttctggccaagcgcccggtggttcgtccccgacctggctgacactatccgtacgatacgaagtatgacggatgcaccaattgtagttggcggcgttggcttttccattttttccgagcgaatcgtcgaatataccggcgctgactttgggattcggggcgacggagagcaggcaatagtttcacttcttaatcagctgcagcggccggaacggcttgaacgcatagatgggttagtccggcggcgcgacggagttattcacagcaaccgaccagcgtggcctgcaccgctttctttgcgcaccgaacgtgatgcgattgataacctcgcttacttcaaaaaaggagggcagtgtggtgtggagaccaaacggggctgtaaccgccgatgcctatattgtgccgacccgctggctaagggtgcggcagtcaggccgagggccccgtcggaggtcgccgatgaggtccagtctctaataggcaagggaatagaagtattgcatttgtgcgactctgagttcaacatctctcaaagccacgcctatgcggtctgcgaagagttcagccgtcgctcatttgcgaaaaaggtgcgctggtacacatatatggcggtggtgccattcgatgccgagcttgccggggctatgagcagagcgggctgtgtcggtatcgactttaccggcgactctgcgtgcccatcaattctaaagacctatcgccagcggcatcataaagaagaccttgcctcggcggtgcgtttgtgccgtgctaacggcataacggttatgatagacctgctgtttggcggcccgggtgaaacgccggaaacggtcgcagagacaatagatttcattaagcaaattgacccggattgcgcaggggctccgctcggtataagaatctaccccggcaccgaaatggcccgaatagtggcaaacgaaggcccaccggaaacgaacccgaacgttcaccgaaagtacgaggggcctgtggatttcttcaaaccaacttactatatatctgaagccctcggtgagcagccggccgggcttatcaaggatttgatttcggcagatgaaagattctttgagccgatgccggaaatagccccggaggctctaaaaagtagccagtccaccgaccacaattacaatgataataccgaacttgtagaagcaatcagcaaaggtgc
acgcggggcatattgggatatactgcgcaagcttcgctgcgactaagcagcttatggtagtagatgattcccgcctgcgggagattggcccgaatcctgaggaatttgttagaagcggatgcaatgttgatttttggggtaaaaacgggggcagggggatttggtccccggtttgaggattccgagaagcccacccgtagggatctccgctcccttagggataaattcgcttcgagtttgaaattggtccccggtttgaggattccgagaagctcacccgtagtgatctccgcttcgcttcggctttgtttgggtttgttttcccgcgtctgcgaagtggttcattttcataatcctttataacatataagtttacgttcattttgggctttcggcaaattgggtttgaattgggtttgtttttttggactgcgaaatcatctttttttctgtaaacctttgttataagagagtttacattcatttgggcatttagtaaattgggtttgattggctttgaattgggtttgttttcaccaagtgtccaattggatttattttcataatcctttgtattatatggattacgttcatttgagcatccagaaaattggctttgttttgcataaaaagggctgatttgtagaggactctttacagttgtagagggcaagttagttaagagtgagctaaagtgcctaaagtgaactaaagttggattctcgattctcgtatagcgtatagcgtatttcacggttattcaccattcattaaggaataaatttgattaggcctgctggcccctccggcgattagtaaatggttctcggcggcaaaacaacgcgcctctataattgggcgaacatgcacgtttgagtcgaaaattggtgctttcttgacaggataaacaggagtaactcgttgtgagaaaaggagtaaaattttttttcaattttccgattttaggttccaactacctgcacttttgattgaaaaatcacaaatgtcttgcctattttaacgcagtttttcgtcgaaacgtcagcgaactaggaaaataggcgatttctgggggaaaacaaataaaaaatgcacaaaagtgacaaaaaaacggccaaaaaagtgctttttttggctgcctttaccccgtgagatgatttaccaaaccttcctctgctattcctatgcaagtttgctcagggctggtgtgaatactataaaaatttgtgctgtaatcactccacaaatcggaggcttcttcagcgtggaaattctggaggccaaaatgaaatacgctgtaatcaccccacaaatcggaggcttcttcagcttcactacctctcaaatcgcccaactatacgctgtaatcaccccacaaatcggaggcttcttcagctcgcaagtcccgtccacgcacaaagtttgagctgtaatcaccccacaaatcggaggcttcttcagcatgagcttttggttgtgctggatatgccagctgtaatcaccccacaaatcggaggcttcttcagcacaaaacggttcaacaaggtcgaagaactagctgtaatcaccccacaaatcggaggcttcttcagcttctgcggagtctttcgccggtgttcaaatgctgtaatcaccccacaaatcggaggcttcttcagcctatcctttataatacattttcctatatagatttacaatacaaaacccacgacaaaactgacttcttcttttgaatcatgccgtattataacacttttttacactatcaaagaccactttttttctattccttctcttttcacgaccccatagaatctcttcagatgttccctctcaaaattgagattatagtgcaaaagcgcatttcgcacccgctttaaagcaacctgttgatcattattataaccgcttctattcattatctcctgaaattctatataattttttgctggtttaatctttcttcgttcgataatccttccttcaagctttagtattgc
atcgattccctctttttgaaggtttgtatatttgtcgaacgcccttgcatagtgcttatggtagtctatttcttttgcttcttttgggataaaatattcccaaagtctgcttaaaaattcaggatcgtccattacatacatctttgtataatccaagatcgaaaagcgtatcgaaaaactcttcttttgctcttttggattttggaatttgaaaataatcacttctctgccatcttccttcttccaatagatatgctgtgccttttctgcaagtttttcgttcaatctgagataatattgccttgccatcataaggcaaagtctgtccattgccagtgtttcatatagcttcttgtttgttctgtcaaatcgatcaagagatgggatgttataatactcttcaccaagaggaatatcaaaatccgaccactttgtcttaattgcttcttcaacaagtctggcaaaactctttttgtcttctttgaagaattcgtatctcaaaaatcccttataaacaaccggctgttccaaaatcctatctaccttttctctaaaagttacctctttttcttctttaggtatcagcccaatatattccttgagattctccgattgcaaactgcccagttcttgtagaaccaaatcacataccctttcatgaagtgcattgagcgttgtatgcccggaaagcttctggataatatttttgtctaatcgtctggttcttttcagttcttcaagttcattataaaatccggcgaagtctttgttcactataaactttaaaatattattatattccttcctgctaagtggcctatcgcatcgctcgttgatatatctgagtatatcccttccttttcgatgtagttcaagcttcctcgatccctcttttttatccgtccacttggccttaacatattccaatcgagtctttaaccttttctcatcatcaacctgtaaaagacctagtcttgaacgtacgaatcttggaaggaagtttatagatttttgaatctcctcattacttattttatctaaataaggcaactgcttctttaaactactaatatagcggtcaattctttgaattgcctcttcggcttttcccaatagactcaaaagaacaagatatttaagttcataaactcccatcctgaatacgttggactgtccaccttttttttgagccttcagaataacattatttcgtttaatataaaattggtcttgcttctctttgtcaaaatgaatctcgactaccggcttttccctgtgaaatttgtgtttgttaccatctgtctcttcgtatcttcgttctccatttaattttacttatgcaaaacatgctgtgtagtctgccaaatccttaagtccataatcctcaagatatttcagtgcaaataatatgaacttgtccgtattattcgctcaactgatcctggcttctctttcgagttagtttgatatgctgatatatctcagtgggaactcgggacaggtatccgagaatatccctgaacataactgcatcatggtcctgcgcctgaaccgaataactatccttaagacaatatttggaaaaaacttgccgtgttatattatattcaccctctgtttttctaaaccatggacatatcccattaagcgatttaaaaatcttattcaataaaaaatgagacaaagaatactacacctgctgatgttatcttatcttcttcttcaaataactctggaaattcaatcgaaatatcttcttgttgttttttcttcgcttcaaaacggctctttcgtatgctttttccataattatccttatggtgtcatttcgattgaatatcaggcagtcaggcgcgtgaagataatgtgagaaataattccttaaattattagtattcactggccgctatttgaatttatcttttactttgttttcctctctttttcataaatcagttttttgtttcctcattaaaccaatcctgtttcatgattttttc
aaatcttgtaagtgtttgctcaaataactttggattttcctctaaatttgtttgtgctctattaagaactattgcaaaacaccactttttttctccttcatattgctcgatagaatacacttctttattgcttgttttttttatattttcaaactccatatttttaatccccgatcttctactaatataaagtcaaagtggtgcgaaatatcgcaccctaccctctgcctgcgcaggaaagaccaacggacgattcgatgcagtttgagctgccaggtgatttatcattcaaggggtaaaatagcagaaaagccttaatgtgtcaaggggattttagatttactatttccaatttacgattttggattgagattgatcggcctaaaagacaggcctcgcaatgacccccttagagttgaaagcactctaaacaagggggcaggcggggggaatatcgaatatcgaatctgaatgtccaatgtcgaagtgcaactgcgcgggaatgacaggtcggcagatttatttaattctgtggccagatccctccgcttcgtccacctgcggtggacttcggtcgggatgacactgggggtgtgccattgctccgctcg Cas13b-t2 locus(SEQ ID NO: 276)agccgagtcgatggtagctaaggtgaacgacaagcgtggttatacggagataagttgatgcgactggttatccatgaggacgaagtagattcgatggattggttatatggaattagatcgaaacgtatacctatgaagctcacagtcaaataccaatcgggatagaaatgcggcgcgcgcaaccttaggcaaaggcttggctgtttcagcgtttccgctttacgtgcccgtttagccttcaccatagtccacctttccgcaagcctccctgcgatcccggacagtcggatttcccaaatccggttctggtctcggccctatttgtcattttctggataaaggccttcctgtacaatttgagacttaagtgctagctcacttgcaccccataattgtacagtttaccagtatcctcgttccgagagtccatggcattcgttccagttcggtgcctggatgcacatgccttactcagaaccaccgagtacccagagcccattgtcaggcgtgggcgctacccactacctggatgactttgaaagtcacctcagaagacattactatccttcatagctcatacggactcatgcgccagaccaaatccctcccaacgtcttggttttcccttgtacgttaggtctttgcaggttgtcgccagtccctgctgggaaacggccatcccgacattatctctgcaatccttgtataggtgcaaggaccataccccgcagcgtcccttcggtgcccttgcccgtttcttcccgaaggactgcggtctcacctcaggatttaaaggttcgacacgccaattatccgtcgcaatgcaacttcaacaacggggcaaatttcggggctgcagtcattccataacgttcaagctcctatacctgctatgccctgcggttgcacccaccactgagcatatatgagctcagggcagccgggccgtttacaccacgcatcgcccggatggttacccattccgagatgtggcatcgctacgtgcctgaatcgggcaactggcacgacgggactttcacccgctggattgcagccttgtcggctgctccaaatccctgttgccacaaaaattttctttgaggcatccacgttacgacgtgtcggccacgcttcgtagatatctgaacagcctttcgacttcacgatgcattataatggacatcgtgaatctagctatgtcatggtcaatcgtagtgtggccaggggccggccaatcgagtatacttgataaagtgttcatgaagctgtattcttaatctcccaagagtatgcttcgaacgtttaagaaatgacagatagtggtgaagtggtctgaaaacgggcccgggaggcgaag
acgtgagtacaggtacagtgaaatggtttaatgcaagaaggggatacggttttattgtccccgatgatggcggagatgatttatttgttcaccgttcggacattaacacagaggactatgcatcgcgagattattaaggtcggcaacgacatggttgccgaccatatcaatcataagggcttggataatcgcaaggccaatttgcgagcggcgacgattgcgcagaatgcgtggaaccgccagcgcaaaagaagcggatttatgggcgtagtgtggaataagcagatgaggaaatggcgtgttaatatcagtcacgagggcacgtgcaggcatatcggctacttcgatgatgaggttgaagcggcgaaggcgcacgaccgggcagcgaaaaaatatcacggagagttcgcgagtttgaatttcacgcgttaaagccacatcacagcgagtccgactatggcggacgcagcaatcttaagcatatttggctgcgcaatgtgttgcgcgggttcctgcttggggcgagctatcgaggtgtaatcaccccacaaatcgggggcttctccagcgccgtaaaagttgataagaattttagatgcgccgtaatcacccctcaaatcgggggcttctccagcgctgaccgaattgataaaaccaagagagcgctgtaatcaccccacaaatcgggggcttctccagctgtacgacaaatcataacagaatatttgaagctgcaatcaccccacaaatcgggggcttctccagcaaaatgagacaccacgcttgacgtcactgtgctgtaatcaccccacaaatcgggggcttctccagcttcgagctatatctggctcggtctgatttggctgtaatcaccccacaaatcgggggcttctccagcactggcttagcaagttcctttgggcgtttcgctgtaatcaccccacaaatcgggggcttctccagctaatcgaagatgagaccgaagactatcactgctgtaatcaccccacaaatcgggggcttctccagctgattgggaaagcactccttacgcacgagagctgtaatcaccccacaaatcgggggcttctccagcacatcctcgataatacgttatctcgattggatttacaacagaaaaatcactgaaaataccagggtttttggtgcaatgcgcacacattagaacctgttttcatactgctaaagaccagctttttcaatcccttcccttttcataattccacagaacctcttaaaatctgccctggcaaaattaaggttatgatgcaaaagcgcgtttcgcacacgtcggagagctttctggtcatcttcattgtaggcgcttttccattatctcgctaaatgggatgtagttctttggaggtcttggctgctctggaccgatagtctttttcaagctcgagtatggcttcaattccttccctttgcaggtttgtgtatctgttcatcccttgcgtataaagcctatggtagtcgatttgttcttcgtcttttggcaaaaaataatcgcaaagtcgggccaaaaactccgcgtcgtccatcacatagagtttcgtataatccctcagcgagaaccgtaccgaacaactctgctccggctgtgccggattcttcaaggtgaaaataattacttcttcgccatcctctttcttccactcgatttgctgtgccctcttggcaagctctttgttaagactaagatgatattgccttgcgatcatcagcaaaagcctgtccattgccagtgtttcatacagtctcttattgtctttatcaaacgtatcaagtgacacgatttcgtaatactccccccccagaggaacatcataaacctctcctttttccctcaccgcttcttcaacaagcctcgcaaaactctttctgctttctctgaagaattcattcctcaaaaatcctttataaataaccggctttcacaaccctgtccgtcttttcttcaaatgacacctcttttc
ttctttgggtatcagtccaacatattccttcagttcttctctgcctaggctttcaagttcttgcacgaccaaatcgcacaccttttcgtgcagcgcattgacgcttttgcctggaaagattgcacacgatgttcttatctatccgtctggtcttcttcaattcttcaagctctcggtaaaacccgtcgaggtgcttagtgaccaaaagctccaaaatacggttatattcatcgatgttcagcggcctctcacaccgctcattgatatacctcagaatatcccgtccttttcgatggagctgaagctccctcgactttcttttttccaaccacttggccttaacataatcaactcgcgccttgatcttttttcatcatcaacccctaagagacccagatgggaacgcacaaacctcggaagaaatctcacgtatccttcaatctcttccgtgcttttcttctctatgtgaggcaactggttgcgcaagctatgaacatacctgtcgattcttttgactgcctctgctccttttcctaataagctcagtagaacaagatatttaagctcgtagacgcccatcctgaatatataggtttgggcgcctttcttctgagttcgcagaatgacgttattgtgtttgatataaaatgggtctccttcggctctctcaaaatgaatctcgactctcggcttcttcctgtgaggtttctgctccttgccatctgtattttcgtcctgctcccgcttaatcctcgttctggcaaaacatgctgtgtagtctgccaaatcctccagcccgtaatcctcaagatagttcagcgcaaacaatatgaacttgtccgtattattcgcttaactgggtttcgcttcgcatttgcgattattgatacgctgatacgactcactgggaactcgtgacaaataccccagaatatcccggaacatgaccgcatcatgatccggcgtataaccgaataactgtccctaagacaatatgtcgaaaaaacgtcccgcgttattttatattccccctctgtacgcgtaaacccctgaacatatcccattaaccgatttaggaaccttctctcagcaaaaaatgacgcgaaaaataccacacctgctgatgttatcctgccgttctatcgaacaactccccaaattcaatcgaaatatcttcctgttccttttttgcctgttcaaaacgcgccttttcgtacgctttttccataattatcctgaccgggtcattttgggtgaatatcaggcagtcaggcgtatgaaaatagtgcgagaaataattcctcaaatctctaagatttcatcagccgctttttgaatttcctcctctacttcgttttgttcttgttttgcatagatcagttttttcgtttcctggtcaaaccaatcttcctttctgatcctttcgaatcgtgtcagcgtttcctcgaacaacttcggattcccctgcaaatttgtttgcgccctattaagcactatcgcaaaacaccacttatggccccctcatattgctcgatagaatacattccttggctgcttcattcttgatattttcaacctgcatatctcagactctcccaattgttgtttttcgccatttttgttgaagtccccgaatgtcagtctattgggccagctgagtcaacccacaaggcacaatgtacatacagtctcgagtcatttcgagaagactttccgctcgcccgataagataagctttgagtatctcacggggtggacccgagcagataattccacatctcgtatccggtgaagctatccggcataaattcgtgcttagtgaatcgtgtttcgtgttgatacggctcccggctgcattcacttttcacggcagagaatatcgcaaaataaggcaacagtcaaaggaaaaagggtaaaaatggtgaaatagatgagcgagcagtgaattgttgtggcaagcaagccgcaaatgaatccttcggccacgctc 
Cas13b-t3 locus (SEQ ID NO: 277)tatccaaaatgtggtttgaattcaagaatcaacgctttattccttaaaaaggggcggtgcgatggaaaaagaaccagaaacatccgtgcaatcggcgtcgggacacaatatggatatcccgattgactggtcggtaacctcacgctatttcgaagatgaagatacgctgatgcaggtggtggggatatttgctgaagactctccgcagaccgtccgggaccttgccaaggctatacagacgcaaatatcccaggatgttcaattgcacgctcacagcctgaagggagcctcggctcttatcggggccgaacatctgcggcaaagagcctggcggcttgaatacgccgcccaggagaaaaacacggcggcgtttgaggcgctgtttgacgagacaaaggccgagttcgacaagctgatgtcgttcctttaccgcgccgattggattgaagcagcaaaagaacgccactgcaacaggcaacaggccgagcaggtatgaaacatcttttggaaaagaaggcgatggaatgagtggatggttctccattttgatcattgatgatgacaggatggttacagacaagttggagaagatcagcggcgccaaggctgcaaagaaaaggttcagcctggcaggcgttttctcaaagggcgcctgaagccatttatttgcaggcgtgctaccgcttgtcaacgggcaggggacagaaccgcaatcaggattaccatcagtttcttcattccattaacctcgctttttcctctcgttctttttcttcttcctggttttcgcagcgttgggctgtctttttgccggttttgtatagttgtcgccgtaaatgtcaatgagtgcggcttttagtttttcgggccagttgcggttttcaaaagcgcacacgagcggatcgccgctttgtttcatccagttatgaagccggccctgcatcttcttgatttcgctcctttgctctgcggaatcgataaggttgttgaggcagtcggggtcatttctgagatcatagaactcctcggccgcgcgatatctgaacatcttgacgcgctgtgcggcgaacttattcgttggagcggcctcgaccatcgccttcatggtaaggccctcgttgttgtttcggtaccagaatctgccgtcggcccacggattgaagatatagccgaagcgcttatcctggacgcaccgcatcgggacagcgtctccgccggctttcatgtctatctgcgtaaagaccacgtcgcgtccggattgcttttcgcctttcaacagccccaggaaagaggaaccgtcaagccccctgggtatgcccagaccgaccgcttcgagcaccgtcgggaagaagtcgatccctgagataaagtgcgccttatcgacggcgcctgcttttaccatttgcggccaacgaacgatccacggcgtccgcgtgctggcaagataggcgttgcattttgcaaacggtatggcgatgccgttgtcggagaggaacatcacaagcgtattctcctcgaagcccgactccttcagggcctgcaacgtcttgccgaaggtatcgtcgagtcggcggacggagttgagatagcagctcagttcctgccgaacgcccggcaggtcgcagacaaaaccgggaaccgcaacctcatcgggcttatacgtctttgaaggttcctttgcccccttgattggcttgccgccgatatgatacgggcgatgcggatcgtgcgagttgaccataaagtagaagggcttattctcgcggcgacacctcgccagaaactccttgcagtaaactgtaatagagttccggatcgcggccggcgccgagttccttctggtcatgcacaaaatcccatttgtaatccgcatggggcgttgagtgccccaccttgccgagaataccggtaagatagccggcatccctcagcgtctgcatgacagtcatcaccaaggcggcgcccaat
cctgcggcccttagaaaatcacgacgattcatcattgtccccactaatccttattgttcttctcaagataccccgacaatttctgcatttgccgatacaggccgccgggacatatcagtatagccgcaaaccttgaaaatatcaacctcccggaatataacgtcgacttccaacccagatcgccaatccagaataagaaaacaaagcaaaacgcttcaaattcgtttaaccccagggttcgcctgaggttcgtaaacaccatctcgatgtacatcgggattcaaattcgttgagccccagcccttcttgtggctcttgttcggcaagaaacgctgtaatcaccccacaaatcgggggctgctccagcatcgccaagacgggcaatgccgctttgaggctgtaatcaccccacaaatcgggggctgctccagctgatttcgagtttcgatgctttcggacagggctgtaatcaccccacaaatcgggggctgctccagcactccttatggagaaggagcttatcgtgtcgctgtaatcaccccacaaatcgggggctgctccagcttattccttccatcatcccgacagcagtgggctgtaatcaccccacaaatcgggggctgctccagcccactttcgtaaccattttactcgcaaacgcttataacgaaaacactttccaaaaaccataccaacgtcctcatttaacaggaaacttccactccttttcaattccatatttcttcataacatcactaaacaacccaaattcatctatcacaaactttaaatgatgatggaaaaacgctctacgcaccttattcacggcggtcttctccgcctctttacacattgtttgtgccagtatctcacgaaaatcaatataatgcgccccttccttctcgctcatctttttggctttgacaaccttctcttcaaacgccagcaccgcctcgacacatttcttctgcagatcattatatgccctaaaccctttttcgtaaactgtatgataccgtatcttccctttttcgtgcggcataaagtactcacatatccgcccaagaaactcagcgtcatccaacacatataacttgccgtaatcactcactgagaagacgatgcttttttcgttacccactgagccctccacgggcaactcgatgctatcattcgaccacacaattttattacccaattccttgcgtacactccccaggaagtattgcgccatcatcagacacaaacggtctcgcgccagcgtttcatacaaggctggattagcaccctcgaatcgcccaatcgcatcaatatgataatactttttatccagcccaacgtccctctgtccgccgccgctttccaaatgctcttccacaagcttcgcgaatcccttcttgctgtcataccagaacttcttgcgcaagaaacccctgcggatagaaatatgctccttgaaccatgcaaccttctgcttgtaatctatttcatccttcttcccaagccccacataatcgtagagcttctgatcgccgattcggcaaagtctgttgagaatctgatcacacaccacctgatgcatctcgtttattgtggaggtctgcgcaaaaattgaatatacccgcccgtcgattcgctcggccagttgcaggcgtttcagtcccgcctgaaaattctcaacatccttgccaaccagacacaccagcagccggttgtactcgccgggattgaaagacctcgtgcaattttcatttacgtattgaagaatgtcccgcgccttctcgtgaagtgtcatctcgttggtcgccgccttcttcttttcccacacccctcgaacatgctttactctgccgtctattctttgcttaaaagctttcctgccaatcccatgttgctccagcacaaatcgcggcaggaagacgtgattatccttatctgtgaccttcactacatccagaatgttctccacatgctgccgatacctgtacagttttgcaatc
gcatcgtcgccctttccctgaaggctaagcaatacaaggtatttcaattcgttaagccccatgcgataactccgaggcccggcattcttatcaatcctgacgataacattgttcttactgatatagtatgactgatcttcgtctttttttgaaaagtcgacaactaccttgcctttggtcctgtgctttttgtgttcgtcgcctgccccggcctcctccctgacaatgtgtcgccgcccgaagcatatctcactgtgttgcgcctccagataatgcagtgcaaactcgatgaacttgtctttatggcgtttcggattcgtcccctcattgtcgtttgctcttttcttgtcgccctgctctccgtggtagtattcatacgcctccgcagggatgcgtccaagctgcgcgagtatatccctgaaaagcagcacgcgtttgtcccacgccttcgtgaaacgactgtctttcaggcaatacatcgaaagcgccttccgagtcagcttgtactgtccttcgtttttcttaagcccacttaccgcaccgtacaaacgatccagcacccgccgttcaacaaagaacgaaacgaaaaacacaacccccgccgtagtgatccggtcgccttcgaacaggctgggaaactcgatgatcacttcagtttcgcgtctcctgcattcaaagatcgcccgctcatacgccctttccatgattgtccgcaactcatcttctgctgtaaatgtcagacacccgggcgaatgtcgatagtgggagaaatagtttcttaacgcctcggccttcgcattggccgcttgtgtgctacacttgatcaaagcgcgtgtatcctcatcgtaccagtcgtgctttgaatacttttcatggcgtaacagcgactcgacaaaaagcccgtcgttcttatctcgattcacaagagccttgttgaaggcaatcgtaaaacaccatttccgagcaccttgatattcatcgatagacaactctctctttttcgaagtctgctttgacacttgcgccattgagcacctcccattccagattttagtgcgatctttacctcatgcctccacaacactcccagcgccaaacgttgagcaaagcaaaatacgccgcaggcgggctccgtcgaatccgtaatcctaatttctaacttcccaatcatctaaaccgcccgcaaccgatttgtcaaccaaaaaccacatcaatccgcagatggccgcagataaccgcagatattgcaactaatccacccaacccaaaacctctgttccatctgcgccctctgcgaaatctgcggacagctttttttttcgtgcccttcatgtcttcgtggtgaatttcatttaacatttgacaaatatcaaacggcatggtataatgcgttgcgtatttaaggacaaagcaacaccaaaaacagggggagtaaaaaaccgtgtccatccaaaaagaatcgcaggccgcaggcctgccacctatgattaacctcggtctttcagccaaggatgctccccacacccaaacaagcgaaacgaaccgtgcgccaagctaagctggtgcaattcagcaggtgtaatcctgcccggtcaaaggttagccgcccggccggaatgaacatgtacgtataaggaggcaacaaat

More detailed sequences and features on Cas13b-t loci are shown in FIGS. 55A-55C.

Alignments of Cas13b-t1, Cas13b-t2, and Cas13b-t3 with other Cas13b orthologs are shown in FIG. 56. In FIG. 56, Sequence #6 is Cas13b-t1, Sequence #1 is Cas13b-t2, and Sequence #2 is Cas13b-t3. Other sequences are Cas13b orthologs.

Cas13b-t is similar to Cas13b from Alistipes sp. ZOR0009 (Cas13b4, NCBI accession WP_047447904). Human codon-optimized proteins (codon optimization by the GeneArt algorithm), synthesized by GenScript into a pcDNA3.1(+) backbone for mammalian expression, were used. Knockdown of Gaussia luciferase (Gluc) was tested in HEK293FT cells with two guide RNAs and a non-targeting control. RanCas13b (B6) was used as a positive control. Luciferase values were normalized to the non-targeting control, so a value of ~1 indicates no knockdown. Some noise was noted in this measurement, so some values were slightly higher than 1 but within a margin attributable to noise. Gluc knockdown in mammalian cells by Cas13b-t1, Cas13b-t2, and Cas13b-t3 is shown in FIGS. 51-53, respectively. Guide RNA keys for Cas13b-t1, Cas13b-t2, and Cas13b-t3 are listed in Tables 15, 16, and 17, respectively.

TABLE 15
Guide RNA keys - Cas13b-t1
Key # | Direct Repeat Sequence | DR is 5′ or 3′ to spacer sequence?
1 | GCTGTAATCACCCCACAAATCGGAGGCTTCTTCAGC (SEQ ID NO: 278) | 3′
2 | GCTGTAATCACTCCACAAATCGGAGGCTTCTTCAGC (SEQ ID NO: 279) | 3′
3 | GCTGAAGAAGCCTCCGATTTGTGGGGTGATTACAGC (SEQ ID NO: 280) | 3′
4 | GCTGAAGAAGCCTCCGATTTGTGGAGTGATTACAGC (SEQ ID NO: 281) | 3′
5 | GCTGTAATCACCCCACAAATCGGAGGCTTCTTCAGC (SEQ ID NO: 282) | 5′
6 | GCTGTAATCACTCCACAAATCGGAGGCTTCTTCAGC (SEQ ID NO: 283) | 5′
7 | GCTGAAGAAGCCTCCGATTTGTGGGGTGATTACAGC (SEQ ID NO: 284) | 5′
8 | GCTGAAGAAGCCTCCGATTTGTGGAGTGATTACAGC (SEQ ID NO: 285) | 5′

TABLE 16
Guide RNA key - Cas13b-t2
Key # | Direct Repeat Sequence | DR is 5′ or 3′ to spacer sequence?
1 | GCTGTAATCACCCCACAAATCGGGGGCTTCTCCAGC (SEQ ID NO: 286) | 3′
2 | GCTGCAATCACCCCACAAATCGGGGGCTTCTCCAGC (SEQ ID NO: 287) | 3′
3 | GCCGTAATCACCCCTCAAATCGGGGGCTTCTCCAGC (SEQ ID NO: 288) | 3′
4 | GGTGTAATCACCCCACAAATCGGGGGCTTCTCCAGC (SEQ ID NO: 289) | 3′
5 | GCTGGAGAAGCCCCCGATTTGTGGGGTGATTACAGC (SEQ ID NO: 290) | 3′
6 | GCTGGAGAAGCCCCCGATTTGTGGGGTGATTGCAGC (SEQ ID NO: 291) | 3′
7 | GCTGGAGAAGCCCCCGATTTGAGGGGTGATTACGGC (SEQ ID NO: 292) | 3′
8 | GCTGGAGAAGCCCCCGATTTGTGGGGTGATTACACC (SEQ ID NO: 293) | 3′
9 | GCTGTAATCACCCCACAAATCGGGGGCTTCTCCAGC (SEQ ID NO: 294) | 5′
10 | GCTGCAATCACCCCACAAATCGGGGGCTTCTCCAGC (SEQ ID NO: 295) | 5′
11 | GCCGTAATCACCCCTCAAATCGGGGGCTTCTCCAGC (SEQ ID NO: 296) | 5′
12 | GGTGTAATCACCCCACAAATCGGGGGCTTCTCCAGC (SEQ ID NO: 297) | 5′
13 | GCTGGAGAAGCCCCCGATTTGTGGGGTGATTACAGC (SEQ ID NO: 298) | 5′
14 | GCTGGAGAAGCCCCCGATTTGTGGGGTGATTGCAGC (SEQ ID NO: 299) | 5′
15 | GCTGGAGAAGCCCCCGATTTGAGGGGTGATTACGGC (SEQ ID NO: 300) | 5′
16 | GCTGGAGAAGCCCCCGATTTGTGGGGTGATTACACC (SEQ ID NO: 301) | 5′

TABLE 17
Guide RNA key - Cas13b-t3
Key # | Direct Repeat Sequence | DR is 5′ or 3′ to spacer sequence?
1 | GCTGTAATCACCCCACAAATCGGGGGCTGCTCCAGC (SEQ ID NO: 302) | 3′
2 | GCTGGAGCAGCCCCCGATTTGTGGGGTGATTACAGC (SEQ ID NO: 303) | 3′
3 | GCTGTAATCACCCCACAAATCGGGGGCTGCTCCAGC (SEQ ID NO: 304) | 5′
4 | GCTGGAGCAGCCCCCGATTTGTGGGGTGATTACAGC (SEQ ID NO: 305) | 5′
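The guide RNA keys in Tables 15-17 vary two things: the orientation of the direct repeat (DR) sequence and whether the DR sits 5′ or 3′ of the spacer. As an illustration only, the following Python sketch checks that key 2 of Table 17 is the reverse complement of key 1 and assembles a guide in both layouts. The function names and the 30-nt spacer are hypothetical and are not part of the disclosure; sequences are written in the DNA alphabet, as in the tables.

```python
# Direct repeats copied from Table 17 (Cas13b-t3), keys 1 and 2.
DR_T3_KEY1 = "GCTGTAATCACCCCACAAATCGGGGGCTGCTCCAGC"
DR_T3_KEY2 = "GCTGGAGCAGCCCCCGATTTGTGGGGTGATTACAGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def assemble_guide(spacer, direct_repeat, dr_side):
    """Join spacer and DR; dr_side is "5" or "3" (DR position relative to the spacer)."""
    if dr_side == "3":
        return spacer + direct_repeat
    if dr_side == "5":
        return direct_repeat + spacer
    raise ValueError("dr_side must be '5' or '3'")


# Hypothetical 30-nt spacer, used only to show the two layouts.
example_spacer = "ACGTACGTACGTACGTACGTACGTACGTAC"
guide_dr_3prime = assemble_guide(example_spacer, DR_T3_KEY1, "3")
guide_dr_5prime = assemble_guide(example_spacer, DR_T3_KEY1, "5")

# Key 2 of Table 17 is the reverse complement of key 1.
keys_are_reverse_complements = reverse_complement(DR_T3_KEY1) == DR_T3_KEY2
```

The same check applied to Table 15 shows that key 3 is the reverse complement of key 1, so each table appears to test both DR orientations in both positions.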

Example 11

This example summarizes the results of RESCUE rounds 1-12 (see FIGS. 57-68). Additional phenotypes tested included PCSK9, Stat3, IRS1, and TFEB. PCSK9 showed cloning improved the promoter. Stat3 showed ~10% editing on sites. Inhibition of signaling will be tested with a luciferase reporter. For IRS1, targeting of a synthetic site will be tested before moving to pre-adipocyte cells. For TFEB, targeting may be designed to cause translocation of the transcription factor, leading to autophagy. In addition, a panel of 12 endogenous phosphosite targets and 48 synthetic targets will be tested. Screening in yeast will continue on the V11 background with S22P. Top hits were screened on V12 for V13, and new rounds of yeast hits will be evaluated. A few hundred additional screen hits on luciferase will be evaluated, and Ade2 editing will be validated for specificity screening. Gene shuffling will also be tested for library complexity and different yeast reporters.

Example 12

This example lists further information and data related to Cas13b-t.

Knockdown of Gaussia luciferase in HEK293FT cells by two guide RNAs was tested. RanCas13b (B6) was used as a positive control. Luciferase values were normalized to the non-targeting control, so the value was about 1 if there was no knockdown. Some values were higher than 1 but within a margin attributable to noise. The dead versions have both the arginine and histidine residues in both identified HEPN domains mutated to alanine.
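The normalization to the non-targeting control described above can be sketched as follows. This is a minimal illustration, assuming that each reading is divided by the mean of the non-targeting replicates; the function name and the luminescence readings are hypothetical and are not data from the disclosure.

```python
from statistics import mean


def normalize_to_nontargeting(targeting, nontargeting):
    """Divide each targeting-guide reading by the mean non-targeting reading."""
    baseline = mean(nontargeting)
    if baseline <= 0:
        raise ValueError("non-targeting baseline must be positive")
    return [value / baseline for value in targeting]


# Hypothetical raw luminescence readings (arbitrary units), for illustration only.
nontargeting_reads = [1000.0, 1100.0, 900.0]
guide_reads = [250.0, 1050.0]

# A value near 1 means no knockdown; values slightly above 1 reflect measurement noise.
normalized = normalize_to_nontargeting(guide_reads, nontargeting_reads)
```

With these made-up numbers the first guide gives 0.25 (knockdown) and the second gives 1.05 (no knockdown, within noise).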

The spacer sequences used in the experiment are shown in Table 18 below.

TABLE 18
Name | Spacer sequence
Guide 1 | GGGCATTGGCTTCCATCTCTTTGAGCACCT (SEQ ID NO: 306)
Guide 2 | GGAATGTCGACGATCGCCTCGCCTATGCCG (SEQ ID NO: 307)
Nontargeting | GTAATGCCTGGCTTGTCGACGCATAGTCTG (SEQ ID NO: 308)

Comparison of dead and live tiny orthologs for Gluc knockdown is shown in FIG. 69.

Recovery of functional Cypridina luciferase (W85X) by RNA editing was tested.

Mismatch distance indicates the distance from the 5′ end of the direct repeat to the A:C mismatch that specifies the desired editing site. Spacer sequences were all 30 bp unless otherwise indicated. The B6 spacer was 30 bp with a mismatch distance of 22. The REPAIRv1 and REPAIRv2 spacers were 50 bp with a mismatch distance of 34 (as published). The tiny ortholog constructs were HIVNES-GS-dRanCas13bt-(GGS)₂-huADAR2dd(E488Q).

Positive control constructs are as follows:

B6 construct: HIVNES-GS-dRanCas13b(B6)-(GGS)₂-huADAR2dd(E488Q)

REPAIRv1 construct: dPspCas13b(B12)-GS-HIVNES-GS-huADAR2dd(E488Q)

REPAIRv2 construct: dPspCas13b(B12)-GS-HIVNES-GS-huADAR2dd(E488Q/T375G)
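The mismatch-distance convention described above can be illustrated with a short Python sketch that places a C opposite a target adenosine in a 30-nt spacer. Several points are assumptions made only for this illustration: the direct repeat is taken to lie 3′ of the spacer, the spacer base adjacent to the direct repeat is counted as distance 1, sequences are written in the DNA alphabet, and the function names and transcript are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_spacer(transcript, target_index, spacer_len=30,
                  mismatch_distance=22, mismatch_base="C"):
    """Return a spacer (5'->3') whose mismatch base pairs with transcript[target_index].

    Assumed convention: the DR is 3' of the spacer, and the spacer base next to
    the DR has distance 1, so the mismatch sits at spacer index
    spacer_len - mismatch_distance (0-based from the spacer 5' end).
    """
    if not 1 <= mismatch_distance <= spacer_len:
        raise ValueError("mismatch_distance must lie within the spacer")
    transcript = transcript.upper()
    # The spacer 3' end pairs with the 5' end of the target window.
    start = target_index - (mismatch_distance - 1)
    end = start + spacer_len
    if start < 0 or end > len(transcript):
        raise ValueError("target window falls outside the transcript")
    spacer = list(reverse_complement(transcript[start:end]))
    spacer[spacer_len - mismatch_distance] = mismatch_base
    return "".join(spacer)


# Hypothetical transcript fragment with a single target adenosine.
upstream = "GCTTGCCTAGGACTTCCGTAGCATCGG"
downstream = "TCAGCCTAGGTC"
transcript = upstream + "A" + downstream
target_index = len(upstream)

spacer = design_spacer(transcript, target_index)
```

With the default values (30-nt spacer, mismatch distance 22), the mismatch C lands at spacer index 8 counted from the spacer 5′ end. A different counting convention would shift this position by one, so the convention should be confirmed against the published constructs before use.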

The data on Cas13b-t1 and Cas13b-t3 are shown in FIG. 70 and FIG. 71, respectively. The comparison between the targeting guides and the non-targeting guide is shown in FIG. 72. Whole-transcriptome sequencing can be performed for detailed specificity and activity analysis.

Example 13

Programmable RNA editing offers an alternative to genome editing with benefits in safety and flexibility in targeting. An approach for RNA editing that leverages the Type VI programmable RNA-guided RNase CRISPR-Cas13 allows for specific adenosine to inosine conversion by guiding the adenosine deaminase activity of a fused ADAR2 to target transcripts. Here, Applicants expanded RNA editing capabilities to an additional base conversion by directly evolving ADAR2 to have cytidine deaminase activity, with a greater than 1,000-fold improvement in catalytic activity. The system, referred to as RNA Editing for Specific C to U Exchange (RESCUE), lacked strict sequence constraints, edited endogenous transcripts with high efficiency, and performed multiplexed C to U and A to I editing. Applicants performed additional rational mutagenesis to generate a highly specific variant of RESCUE, with a greater than 10-fold reduction in A to I off-targets, which retained efficient C to U on-target activity. Applicants showed herein RESCUE's ability to alter phosphorylation signaling pathways in cells and modulate STAT activation and cellular growth. RESCUE expanded the RNA editing toolbox by enabling correction of additional mutations and modulation of more protein residues for broad applicability to biomedical research and therapeutics.

The programmable modification of nucleic acids in cells has numerous applications in basic research and therapeutics, especially in the treatment of genetic disease. DNA editing, typically through generation of double-stranded breaks (DSBs) to stimulate endogenous DNA repair pathways such as non-homologous end joining (NHEJ) or homology-directed repair (HDR), has become widely accessible with the development of tools based on CRISPR nucleases, including Cas9 and Cpf1/Cas12a. However, introduction of specific edits, including single base changes, relies on HDR and is inefficient in many cell types. Furthermore, the potential for off-target cleavage or DNA damage responses poses potential safety risks. DNA editors that circumvent DSB formation, such as base editors, provide a viable alternative, although they may be limited by sequence constraints, such as the requirement for a protospacer adjacent motif (PAM) near the desired editing site, and can have significant off-targets. However, temporally controlled editing of nucleic acids through RNA base editing would avoid many of these issues and have many applications, including modulation of cellular signaling, protein stability, or other post-translationally modified residues.

RNA base editing offers an alternative to DNA base editors, leveraging the adenosine deaminase acting on RNA (ADAR) family of enzymes to enact specific hydrolytic deamination of adenosine to inosine, a nucleobase that is functionally equivalent to guanosine in translation and splicing. Multiple RNA editing technologies have been developed that direct activity of ADAR or hyperactive variants to target transcripts, including RNA editing for programmable A to I (G) replacement (REPAIR), which uses the RNA-guided, RNA-targeting CRISPR enzyme Cas13. While these technologies can effectively convert A to I (G), other base changes remain inaccessible, preventing editing of diverse disease-associated mutations and functional residues involved in post-translational modifications. Cytidine to uridine editing via hydrolytic deamination activity would open up the targeting space and provide multiple new types of residue changes. However, many cytidine deaminases, such as the apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like (APOBEC) family of enzymes, can only operate on single-stranded substrates and will deaminate many of the cytosines in proximity of the APOBEC binding site.

Here, Applicants take advantage of features of the adenosine deaminase ADAR2. REPAIR, using ADAR2, allows for precise editing via formation of a double-stranded RNA substrate using the guide RNA, which directs the activity of a hyperactive mutant of the human ADAR2 catalytic deaminase domain (ADAR2dd[E488Q]) to a single adenosine selected by an introduced mismatch. Applicants performed evolution of ADAR2dd for cytidine deamination to confer this level of precision to cytidine base conversion. Applicants used a combined rational mutagenesis and directed evolution scheme to iteratively boost the cytidine deamination activity of ADAR2dd more than 1,000-fold. This mutant ADAR2dd, fused to the Cas13b ortholog from Riemerella anatipestifer (RanCas13b), allowed for RNA Editing for Specific C to U Exchange (RESCUE) on both reporter and endogenous transcripts in mammalian cells. Lastly, Applicants improved the specificity of RESCUE more than 10-fold via rational mutagenesis and demonstrated phenotypic modulation of protein signaling and cell growth through C to U editing with RESCUE.

In order to generate a Cas13b-guided nucleoside deaminase capable of generating programmable C to U modifications, Applicants began a series of engineering steps on a RanCas13b-ADAR2dd fusion (FIGS. 73A-73G). The initial mutations were selected by saturation mutagenesis at residues involved in the binding of the targeted base. Mutants were evaluated for C to U editing and restoration of the catalytic activity of a Gaussia luciferase (Gluc) mutant (C82R) (FIG. 77A). Three rounds of rational engineering produced a construct (RESCUEv3) with ~15% editing on the TCG motif (FIG. 73B). As the surrounding motif strongly determines RNA editing efficiency for A to I editing, Applicants tested for restoration of activity of luciferase mutants with all four possible 5′ bases at the Gluc C82R site and two 3′ motifs at the Gluc L77P mutation (FIG. 77B), finding modest increases in activity with these other motifs. To hasten further improvements, Applicants began directed evolution across the ADAR2dd protein to identify additional candidate mutations for increasing the activity of RESCUE.

To select for C to U activity, Applicants engineered a set of yeast reporter assays based on either restoration of GFP fluorescence or prototrophic reversion of a HIS auxotrophic selection gene (FIG. 73A, see Table 19 for all screens and resulting mutations). With similar approaches, directed evolution of cytidine deaminase acting on RNA (CDAR) may also be performed.

TABLE 19
RESCUE version number | Mutations | Screening method
RESCUEv0 | ADAR2 + E488Q | Hyperactive variant from Kuttan and Bass
RESCUEv1 | v0 + V351G | Rational mutagenesis
RESCUEv2 | v1 + S486A | Rational mutagenesis
RESCUEv3 | v2 + T375S | Rational mutagenesis
RESCUEv4 | v3 + S370C | Y66H EGFP
RESCUEv5 | v4 + P462A | P196L HIS
RESCUEv6 | v5 + N597I | P196L HIS
RESCUEv7 | v6 + L332I | P196L HIS
RESCUEv8 | v7 + I398V | P196L HIS
RESCUEv9 | v8 + K350I | P196L HIS
RESCUEv10 | v9 + M383L | P196L HIS
RESCUEv11 | v10 + D619G | S22P HIS
RESCUEv12 | v11 + S582T | S22P HIS
RESCUEv13 | v12 + V440I | S22P HIS
RESCUEv14 | v13 + S495N | P196L HIS
RESCUEv15 | v14 + K418E | P196L HIS
RESCUEv16 | v15 + S661T | S22P HIS
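Because each RESCUE version in Table 19 adds one mutation to the previous version, the full mutation set of any version can be obtained by accumulating the rows. The following Python sketch shows one way to do this. The data are copied from Table 19, while the list and function names are hypothetical.

```python
# (version label, mutation added in that version), in the order given in Table 19.
RESCUE_STEPS = [
    ("RESCUEv0", "E488Q"), ("RESCUEv1", "V351G"), ("RESCUEv2", "S486A"),
    ("RESCUEv3", "T375S"), ("RESCUEv4", "S370C"), ("RESCUEv5", "P462A"),
    ("RESCUEv6", "N597I"), ("RESCUEv7", "L332I"), ("RESCUEv8", "I398V"),
    ("RESCUEv9", "K350I"), ("RESCUEv10", "M383L"), ("RESCUEv11", "D619G"),
    ("RESCUEv12", "S582T"), ("RESCUEv13", "V440I"), ("RESCUEv14", "S495N"),
    ("RESCUEv15", "K418E"), ("RESCUEv16", "S661T"),
]


def mutations_in(version):
    """Return all ADAR2dd mutations present in RESCUEv<version>, oldest first."""
    if not 0 <= version < len(RESCUE_STEPS):
        raise ValueError("unknown RESCUE version")
    return [mutation for _, mutation in RESCUE_STEPS[: version + 1]]


def residue_position(mutation):
    """Extract the residue number from a label such as 'E488Q'."""
    return int(mutation[1:-1])


v16_mutations = mutations_in(16)
v16_by_position = sorted(v16_mutations, key=residue_position)
```

RESCUEv16 thus carries the hyperactive E488Q mutation of RESCUEv0 plus the 16 mutations added in versions 1 through 16.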

Sequencing FACS-sorted cultures or surviving colonies, for GFP and His restoration respectively, identified individual mutations in the ADAR2dd domain, which were introduced onto the previous RESCUE version and evaluated for activity in mammalian cells on luciferase or CTNNB1 editing reporter constructs. These rounds of evolution, culminating with the final construct RESCUEv16, resulted in a steady increase in activity across all six motifs tested and reduced the RESCUE and guide plasmid doses required to edit and restore luciferase activity (FIGS. 73C, 73D, 78, 79A-79B, 80). Additionally, RESCUEv16 achieved higher than 20 percent editing on 12 out of 16 possible motif combinations of the direct 5′ and 3′ bases with optimal base flips of either C or U (FIGS. 73E and 81). Applicants compared RESCUE versions with fusions of PspCas13b and RanCas13b and found them to be equivalently active (FIG. 82). While REPAIR uses 50 nt guides, RESCUEv16 edited the TCG construct optimally with a 30 nt guide RNA with the targeting base flip 26 base pairs from the 5′ end of the target (FIG. 83).

To validate the improvements from the directed evolution pipeline in the yeast system, Applicants tested multiple RESCUE iterations both for activity in yeast and biochemically. Testing both EGFP and His restoration in yeast, Applicants found that later versions of RESCUE more effectively performed C to U editing on both targets (FIGS. 84A-84D). Biochemical characterization of RESCUE constructs introduced into purified hADAR2dd protein revealed that RESCUE mutations improved the kinetics of C to U editing on substrates in vitro (FIGS. 85A-85B).

Further, Applicants assayed C to U activity in the absence of a Cas13b construct. Applicants introduced the RESCUEv16 mutations into both the ADAR2 deaminase domain and the full-length ADAR2 protein. Applicants found that editing and restoration of luciferase activity were significantly higher on all 5′ motifs for the complete RESCUEv16 construct when compared to ADARdd, full-length ADAR, or the absence of protein (FIGS. 73F and 86A), and that, while certain guide positions achieved editing of almost 20% with full-length ADAR (FIGS. 86B-86D), maximal efficiency was markedly reduced compared to RESCUE, establishing that the RanCas13b fusion was necessary for its function. The positions of the 16 mutations in RESCUEv16 place them throughout the structure of ADAR2dd (FIG. 73G), indicating both direct interactions of the introduced residues with the catalytic pocket and long-range allosteric effects.

As RESCUE was evolved to have activity on reporter constructs, Applicants evaluated how well RESCUE could work on endogenous transcripts in HEK293FT cells. Applicants tested a panel of guide RNAs with varying mismatch positions targeting 24 different sites across 9 genes (FIGS. 74A and 87A-87C), specifically choosing sites across these genes to have varying 5′ base identities to interrogate the deamination activity on different motifs. Applicants found that RESCUEv16 achieved editing rates between ~5% and 35% at all sites tested, and that the ideal mismatch position or base flip was site dependent. Moreover, RESCUEv16 outperformed all other versions on multiple endogenous sites and required less dosing than earlier versions (FIGS. 74B and 88). To better evaluate the relevance of RESCUEv16 for therapeutics, Applicants designed a series of twenty-two 200 bp targets to model editing of disease-relevant mutations from ClinVar (see Table 20).

TABLE 20
Disease information for disease-relevant mutations
Candidate | Gene | Diseases
NM_000071.2(CBS): c.325T>C (p.Cys109Arg) | CBS | Thoracic aortic aneurysm and aortic dissection
NM_000141.4(FGFR2): c.799T>C (p.Ser267Pro) | FGFR2 | Pfeiffer syndrome/Crouzon syndrome/Neoplasm of stomach
NM_000551.3(VHL): c.473T>C (p.Leu158Pro) | VHL | Von Hippel-Lindau syndrome
NM_002474.2(MYH11): c.3791T>C (p.Leu1264Pro) | MYH11 | Aortic aneurysm, familial thoracic 4/Thoracic aortic aneurysm and aortic dissection
NM_000018.3(ACADVL): c.848T>C (p.Val283Ala) | ACADVL | Very long chain acyl-CoA dehydrogenase deficiency
NM_002397.4(MEF2C): c.2T>C (p.Met1Thr) | MEF2C | Mental retardation, stereotypic movements, epilepsy, and/or cerebral malformations
NM_002834.4(PTPN11): c.853T>C (p.Phe285Leu) | PTPN11 | Noonan syndrome
NM_005609.3(PYGM): c.2392T>C (p.Trp798Arg) | PYGM | Glycogen storage disease, type V
NM_001256850.1(TTN): c.90211T>C (p.Cys30071Arg) | TTN | Limb-girdle muscular dystrophy, type 2J/Distal myopathy Markesbery-Griggs type/Hereditary myopathy with early respiratory failure/Myopathy, early-onset, with fatal cardiomyopathy/Familial hypertrophic cardiomyopathy 9
NM_005633.3(SOS1): c.806T>C (p.Met269Thr) | SOS1 | Noonan syndrome 4/Noonan syndrome
NM_015559.2(SETBP1): c.2612T>C (p.Ile871Thr) | SETBP1 | Schinzel-Giedion syndrome
NM_004572.3(PKP2): c.2386T>C (p.Cys796Arg) | PKP2 | Arrhythmogenic right ventricular cardiomyopathy, type 9
NM_000138.4(FBN1): c.4222T>C (p.Cys1408Arg) | FBN1 | Marfan syndrome
NM_000375.2(UROS): c.217T>C (p.Cys73Arg) | UROS | Congenital erythropoietic porphyria
NM_014139.2(SCN11A): c.1187T>C (p.Leu396Pro) | SCN11A | not provided/Neuropathy, hereditary sensory and autonomic, type VII
NM_000152.4(GAA): c.1655T>C (p.Leu552Pro) | GAA | Glycogen storage disease, type II
NM_020630.4(RET): c.1858T>C (p.Cys620Arg) | RET | Multiple endocrine neoplasia, type 2a/Multiple endocrine neoplasia, type 2/MEN2A and FMTC
NM_000016.5(ACADM): c.199T>C (p.Tyr67His) | ACADM | Medium-chain acyl-coenzyme A dehydrogenase deficiency
NM_014874.3(MFN2): c.227T>C (p.Leu76Pro) | MFN2 | Charcot-Marie-Tooth disease, type 2A2A
NM_000341.3(SLC3A1): c.1400T>C (p.Met467Thr) | SLC3A1 | Cystinuria
NM_000431.3(MVK): c.803T>C (p.Ile268Thr) | MVK | Mevalonic aciduria/Hyperimmunoglobulin D with periodic fever
NM_004004.5(GJB2): c.229T>C (p.Trp77Arg) | GJB2 | Deafness, autosomal recessive 1A/Deafness, autosomal dominant 3a/Nonsyndromic hearing loss and deafness
NM_000041.4(APOE): c.388T>C (p.Cys130Arg) | APOE | Alzheimer disease 2
NM_000041.4(APOE): c.595T>C (p.Cys176Arg) | APOE | Alzheimer disease 2

RESCUEv16 was able to edit these sites with efficiencies ranging from ~1% to 42% (FIGS. 74C and 89). Applicants further tested therapeutic applications on the ApoE4 allele, which increases Alzheimer's risk ~10-fold and involves two cytosine single-nucleotide polymorphisms that would need to be converted to thymines to generate the protective ApoE2 allele. Applicants tested RESCUEv16 on an expressed synthetic fragment from the ApoE4 allele and found that the system achieved editing of ~5% and 12% on the two sites (FIG. 90).

As RESCUEv16 retained adenosine deaminase activity, the native pre-crRNA processing activity of Cas13b enables multiplexed adenine and cytosine deamination. By delivering RESCUEv16 along with a pre-crRNA targeting an adenine and a cytosine in the same CTNNB1 transcript (FIG. 74D), Applicants found that RESCUEv16 was able to edit both targeted residues in the same population, converting the adenine to inosine and the cytosine to uridine at rates of ~15% and 5%, respectively (FIG. 74E). Additionally, Applicants found that, when editing Gluc and endogenous genes, A to I off-targets near the targeted cytosine occurred within the guide duplex (FIGS. 91A-91C). To eliminate these off-targets, Applicants introduced disfavorable guanine mismatches in the guide across from off-target adenosines (FIG. 74F). This approach significantly reduced off-target editing on both Gluc and KRAS while minimally disrupting the on-target editing (FIG. 74G).

The A to I off-targets observed within the guide duplex window suggested that RESCUEv16 might have significant off-target adenosine deaminase activity across the transcriptome. Profiling off-targets with whole-transcriptome RNA sequencing, Applicants found that while RESCUEv16 had ~80% C to U editing on the Gluc transcript (FIG. 75A), it consequently had 188 C to U off-targets and 1,695 A to I off-targets, comparable to A to I off-targeting with REPAIRv1, which had 24 C to U off-targets and 2,214 A to I off-targets (FIGS. 75A, 75B). To improve the specificity of RESCUEv16, Applicants performed rational mutagenesis at residues interacting with the RNA target (FIG. 75C), resulting in multiple RESCUEv16 mutants with reduced A to I off-target activity, as measured by a luciferase reporter, and high C to U on-target deamination activity (FIG. 75D). The top specificity mutant, S375A on RESCUEv16 (RESCUEv16S), maintained ~76% on-target C to U editing (FIG. 75E), but only had 103 C to U off-targets and 139 A to I off-targets, an approximate 10-fold reduction in the number of adenine deamination off-targets (FIGS. 75E, 75F). Although the off-target editing of RESCUEv16S was reduced, it still maintained significant on-target A to I editing activity (FIGS. 92A-92D). Applicants re-evaluated the efficacy of RESCUEv16S on the previous set of endogenous sites and found that it retained similar activity to RESCUEv16 at many sites and, at a number of sites, performed better than RESCUEv16 (FIGS. 93A-93C and 94A). Moreover, within the guide duplex window, RESCUEv16S was much more specific, having significantly reduced editing at many local off-target sites (FIGS. 93C, 94B-94E).
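The fold changes implied by the off-target counts above can be checked with simple arithmetic. In the Python sketch below, the counts are taken from the preceding paragraph, and the dictionary layout and function name are illustrative only.

```python
# Transcriptome-wide off-target counts quoted in the text above.
OFF_TARGETS = {
    "RESCUEv16": {"C_to_U": 188, "A_to_I": 1695},
    "REPAIRv1": {"C_to_U": 24, "A_to_I": 2214},
    "RESCUEv16S": {"C_to_U": 103, "A_to_I": 139},
}


def fold_reduction(before, after):
    """Ratio of off-target counts before and after a specificity improvement."""
    if after <= 0:
        raise ValueError("after must be positive")
    return before / after


a_to_i_fold = fold_reduction(OFF_TARGETS["RESCUEv16"]["A_to_I"],
                             OFF_TARGETS["RESCUEv16S"]["A_to_I"])
c_to_u_fold = fold_reduction(OFF_TARGETS["RESCUEv16"]["C_to_U"],
                             OFF_TARGETS["RESCUEv16S"]["C_to_U"])
```

Here 1,695/139 is about 12.2, consistent with the approximately 10-fold reduction in A to I off-targets, while C to U off-targets fall by a factor of about 1.8 (188/103).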

The cytidine and adenosine deamination activity of RESCUEv16 allowed for modulation of post-translational modifications via missense mutations, such as at the phosphorylation substrates serine and tyrosine. STAT3 and STAT1 are transcription factors that play important roles in signal transduction via the JAK/STAT pathway and are typically activated by cytokines and growth factors. To demonstrate signaling modulation via RNA editing, Applicants altered activation of the STAT pathway by editing phosphorylation sites Y705 and S727 on STAT3 and Y701 and S727 on STAT1 with RESCUEv16 (FIG. 76A). In HEK293FT cells, Applicants observed 8% and 9% editing of the Y705 and S727 STAT3 sites, respectively, and 11% and 7% editing of the Y701 and S727 STAT1 sites, respectively (FIG. 76B). These edits resulted in 16%-27% repression of STAT3 and STAT1 activity using a luciferase reporter for STAT activation (FIG. 76C).

As with the JAK/STAT pathway, the Wnt pathway can be modulated by phosphorylation of constituent proteins, most notably Beta-catenin. Phosphorylated residues on Beta-catenin, such as S33 and S37, promote ubiquitination and degradation. Wnt signaling blocks residue phosphorylation and stabilizes Beta-catenin, allowing the protein to engage transcription factors like LEF and TCF1/2/3, promoting expression of target genes, and leading to increased cell proliferation. Applicants tested a panel of guides against residues known to be involved in phosphorylation of Beta-catenin and found editing levels between 5% and 28% (FIG. 76F), resulting in up to 5-fold activation of Beta-catenin (FIG. 76G) as measured by a TCF/LEF-dependent luciferase reporter. Correspondingly, cells transfected with RESCUEv16 targeting phosphorylation sites showed a 40% increase in cell growth in the most activated Beta-catenin condition, targeting the T41I conversion (FIG. 76H).
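To illustrate which missense changes at phosphorylatable residues are reachable by a single deamination, the following Python sketch enumerates codon outcomes for serine, threonine, and tyrosine under the standard genetic code. It treats C to U as C to T and A to I as A to G (inosine is read as guanosine), written in the DNA alphabet. The function and variable names are hypothetical, and the sketch ignores sequence context and editing efficiency.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Each edit type maps a source base to the base it is read as after deamination.
EDITS = {"C_to_U": ("C", "T"), "A_to_I": ("A", "G")}


def single_edit_outcomes(residue):
    """Return {(edit type, original residue, new residue)} for one-base missense edits."""
    outcomes = set()
    for codon, aa in CODON_TABLE.items():
        if aa != residue:
            continue
        for edit, (source, product_base) in EDITS.items():
            for position, base in enumerate(codon):
                if base != source:
                    continue
                edited = codon[:position] + product_base + codon[position + 1:]
                new_residue = CODON_TABLE[edited]
                if new_residue != residue:  # skip synonymous changes
                    outcomes.add((edit, residue, new_residue))
    return outcomes


threonine_outcomes = single_edit_outcomes("T")
tyrosine_outcomes = single_edit_outcomes("Y")
serine_outcomes = single_edit_outcomes("S")
```

Under these assumptions, threonine can be converted to isoleucine by C to U editing at the second codon position, which matches the T41I conversion described above, and tyrosine can only be changed (to cysteine) by A to I editing.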

RESCUEv16 is a programmable base editing tool capable of precise cytidine to uridine conversion in RNA. Using directed evolution, Applicants demonstrated that adenosine deaminases can be relaxed to accept other bases, resulting in a novel cytidine deamination mechanism that can edit double-stranded RNA via base-flipping. Applicants have been able to boost the cytidine deaminase activity of ADAR2dd 1,000-fold, resulting in up to 40% editing on endogenous transcripts. Further rounds of evolution may be performed to boost the activity even more. The larger targetable amino acid space of RESCUE's cytidine deamination activity increased possible modulation of post-translational modifications, such as phosphorylation, glycosylation, and methylation sites, as well as better targeting of common catalytic residues (FIGS. 95A-95B). Moreover, cytidine deamination activity allows for expanded targeting of disease-associated mutations with RNA editing and generation of protective alleles, such as ApoE2. Overall, RESCUE extended the RNA targeting toolkit with new base editing functionality, allowing for better modeling and treatment of genetic disease.

RESCUE v16S was able to effectively edit endogenous genes (FIG. 96). RESCUE v16S maintained some A to I activity (FIG. 97). RESCUE v16 was used to target STAT to reduce IFNγ/IL6 induction (FIG. 98). RESCUE targeting induces cell growth (FIGS. 99A-99B).

Materials and Methods

Design and Cloning of Yeast Constructs

For expression of the dRanCas13b-hADAR2dd construct in yeast, the fusion protein was cloned downstream of a pGAL promoter in a pRSII426 backbone, by modifying pML104 (Addgene #67638). To improve expression, a GS linker was cloned between the fusion proteins, and ADAR2dd was codon optimized for yeast. Additional codon mutations, corresponding to iterations of RESCUE, were introduced via Gibson cloning.

Targeting plasmids for testing activity in yeast were engineered for both fluorescent screens (GFP) and auxotrophic selection screens (His). All targeting plasmids were cloned into the pYES3/CT backbone (Thermo Scientific). All plasmids contained a RanCas13b guide cassette for RESCUE, with expression driven by the ADH1 promoter, and spacer and DR sequences flanked by HH and HDV ribozymes [cite ng and dean]. A construct with the spacer replaced by a golden gate site was cloned to facilitate modular guide cloning.

To generate a GFP indicator of C to U RNA editing activity, the Y66H green-to-blue mutation was introduced into a yeast codon optimized EGFP (yeGFP) driven by the TEF promoter. Successful C to U RNA editing restores the green fluorescence of this construct. His reporters for C to U editing were generated by testing conserved residues in HIS3 for loss of activity when mutated to residues that could be rescued by RNA editing. Mutations that created inactive HIS3 were cloned into a HIS3 gene, under its native HIS3 promoter, in the pYES3/CT backbone.

Generation of Mutagenesis Libraries for Yeast Screening

To generate mutagenesis libraries for screening mutations in yeast systems, the hADAR2 deaminase domain was mutated using Genemorph II (Agilent Technologies) for error-prone PCR across eight 50 mL reactions differing in template input from 74 ng to 9.4 μg via a two-fold dilution series. Following amplification, reactions were pooled, diluted 1:4 in DI water and loaded into a 2% gel containing ethidium bromide. Extracted samples were purified using a MinElute PCR Purification Kit (Qiagen) before treatment with DpnI (Thermo Fisher Scientific) at 37° C. for 2 h to remove residual template plasmid and subsequent gel and MinElute purification.
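The template inputs of the eight error-prone PCR reactions follow from the two-fold dilution series. As a minimal sketch, assuming the series starts at a highest input of 9.4 μg and is halved seven times (the function name is illustrative and not part of the disclosure), the inputs can be tabulated as follows; the lowest value comes to roughly 73 ng, in line with the stated lower bound of about 74 ng.

```python
def twofold_series(top_ng: float, n_points: int) -> list[float]:
    """Return n_points template inputs (in ng), halving from top_ng."""
    return [top_ng / (2 ** i) for i in range(n_points)]


# Assumption: the highest input is 9.4 ug (9400 ng) and there are eight reactions.
inputs_ng = twofold_series(9400.0, 8)

for i, ng in enumerate(inputs_ng, start=1):
    print(f"reaction {i}: {ng:.1f} ng")
```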

Backbone was generated by digesting 7 μg of template plasmid with KflI, RruI, and Eco72I (Thermo Fisher Scientific) for 1 hr. The digest was gel purified with the MinElute PCR Purification kit and eluted in 30 μL of pre-warmed water.

The purified PCR insert and digested backbone were ligated using Gibson Assembly (New England Biolabs); specifically, 456 ng of PCR insert and 800 ng of backbone digest were run in an 80 μL Gibson reaction for 1 hr. The product was concentrated by isopropanol precipitation and resuspended in 12 μL of TE-EF redissolving buffer (Macherey-Nagel) by heating to 50° C. for 5 minutes while shaking at 300 r.p.m. 50 μL of Endura Electrocompetent cells (Lucigen) were thawed on ice for 10 minutes and 2 μL of resuspended Gibson product was added. The mixture was electroporated using a GenePulser Xcell (Bio-Rad) following optimal Endura settings (1.0 mm cuvette, 10 μF, 600 Ohms, and 1800 Volts). Samples from each electroporation were recovered in 1 mL of Recovery Media (Lucigen) and incubated at 37° C. for 1 hr while shaking at 300 r.p.m. Two electroporations were performed per mutagenesis library. The recovered culture was plated on a large pre-warmed 100 μg/mL ampicillin plate. Serial dilutions were prepared to determine the c.f.u. of each library. Plates were incubated at 37° C. for 16 hr and harvested using the Nucleobond Xtra Maxi Kit (Macherey-Nagel).
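The c.f.u. estimate from a serial dilution plate is simple arithmetic. The sketch below is illustrative only: the function name, the colony count, the dilution factor, the plated volume, and the assumption that the two 1 mL recoveries are pooled into 2 mL are all hypothetical and are not values reported here.

```python
def library_cfu(colonies: int, dilution_factor: float,
                plated_volume_ml: float, culture_volume_ml: float) -> float:
    """Estimate total c.f.u. in a recovered culture from one dilution plate.

    colonies          -- colonies counted on the dilution plate
    dilution_factor   -- fold dilution of the plated sample (e.g. 1e4)
    plated_volume_ml  -- volume of diluted sample spread on the plate
    culture_volume_ml -- total volume of the recovered culture
    """
    cfu_per_ml = colonies * dilution_factor / plated_volume_ml
    return cfu_per_ml * culture_volume_ml


# Hypothetical example: 150 colonies from 0.1 mL of a 1e4-fold dilution,
# with 2 mL of recovered culture (assumed pooling of two electroporations).
total = library_cfu(150, 1e4, 0.1, 2.0)
print(f"{total:.2e} c.f.u.")
```

Comparing this number with the number of distinct variants expected in the library gives a rough sense of how well the library is covered.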

Transformation of Mutagenesis Libraries in Yeast

Large scale yeast transformation was carried out as previously described. Briefly, colonies containing the Y66H EGFP or HIS3 reporter plasmids were picked into 300 mL of -Trp 2% glucose selection media and grown overnight at 30° C. After growth, the OD600 of the cells was determined and 2.5e9 cells were added to 500 mL of pre-warmed 2×YPAD and incubated for 4 hours at 30° C. The cell pellet was washed multiple times and then resuspended in 36 mL of transformation mix containing 24 mL of PEG 3350 (50% w/v), 3.6 mL of 1.0 M lithium acetate, 5 mL of denatured single-stranded carrier salmon sperm DNA at 2.0 mg/mL (Thermo Fisher Scientific), 2.9 mL of water, and 500 μL of 1 μg/μL plasmid library. After incubation at 42° C. for 60 minutes, the cell pellet was resuspended in 750 mL of -Ura/-Trp 2% glucose selection media and grown overnight until the culture reached an OD600 of 5-6. At that point, 6 mL of the culture was seeded into 250 mL of 2% raffinose -Ura/-Trp selection media and incubated until the OD600 was 0.5-1. Cultures were induced by adding 27 mL of 30% galactose and incubated overnight at 30° C. for 12-14 hours. Cells were then either subjected to cell sorting or plated on selection plates, as described below.
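The listed component volumes can be checked against the stated 36 mL total, and the final concentrations follow from C1·V1 = C2·V2. The sketch below uses only the volumes and stock concentrations given above; the variable names are illustrative, and the galactose estimate additionally assumes that volumes are additive and that nothing was removed from the culture before induction.

```python
# Component volumes (mL) of the transformation mix as listed in the text.
components_ml = {
    "PEG 3350 (50% w/v)": 24.0,
    "1.0 M lithium acetate": 3.6,
    "carrier DNA (2.0 mg/mL)": 5.0,
    "water": 2.9,
    "plasmid library (1 ug/uL)": 0.5,
}

total_ml = sum(components_ml.values())

# Final concentrations in the mix (C1 * V1 = C2 * V2).
peg_final_pct = 50.0 * components_ml["PEG 3350 (50% w/v)"] / total_ml
liac_final_molar = 1.0 * components_ml["1.0 M lithium acetate"] / total_ml
carrier_final_mg_ml = 2.0 * components_ml["carrier DNA (2.0 mg/mL)"] / total_ml
plasmid_total_ug = 1.0 * components_ml["plasmid library (1 ug/uL)"] * 1000.0

# Induction: 27 mL of 30% galactose added to 250 mL media plus 6 mL inoculum
# (assumes additive volumes).
galactose_final_pct = 30.0 * 27.0 / (250.0 + 6.0 + 27.0)

print(f"total mix: {total_ml:.1f} mL")
print(f"PEG: {peg_final_pct:.1f}% w/v, LiAc: {liac_final_molar:.2f} M")
print(f"carrier DNA: {carrier_final_mg_ml:.2f} mg/mL, plasmid: {plasmid_total_ug:.0f} ug")
print(f"galactose at induction: {galactose_final_pct:.1f}%")
```

Under these assumptions the galactose concentration at induction comes to a little under 3%, close to the 3% galactose used in the selection plates.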

Fluorescent Cell Sorting of Yeast Libraries

After induction, cells were sorted on a SH800S Cell Sorter by gating for EGFP fluorescence compared to a negative non-induced and non-targeting guide control. After 100 million cells had been sorted into 2% glucose -Ura/-Trp selection media, Applicants incubated the sorted cells overnight, and then seeded them into 2% raffinose -Ura/-Trp selection media when their OD600 was 5-6. Cells were then induced when the OD600 was between 0.5-1 and incubated overnight for 12-14 hours before sorting again. Sorting was performed until 10-20 million cells had been sorted. The iterative growth and sorting was repeated 2-3 additional times, and plasmids from each iteration of sorted cells were harvested and sequenced by Illumina NextSeq next generation sequencing to ascertain the mutants present at each round of selection. Top enriched mutants were individually ordered and cloned for mammalian validation testing as described below.
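The text does not specify how enrichment was scored from the sequencing data. One common approach, shown here purely as a hedged sketch, is to compare each mutant's read frequency before and after a round of sorting, with a pseudocount to avoid division by zero. The function, the pseudocount, the mutant labels, and the read counts below are all hypothetical.

```python
from collections import Counter


def enrichment(before: Counter, after: Counter,
               pseudocount: float = 1.0) -> dict[str, float]:
    """Ratio of post-sort to pre-sort read frequency for each mutant."""
    mutants = set(before) | set(after)
    total_before = sum(before.values()) + pseudocount * len(mutants)
    total_after = sum(after.values()) + pseudocount * len(mutants)
    scores = {}
    for mutant in mutants:
        freq_before = (before[mutant] + pseudocount) / total_before
        freq_after = (after[mutant] + pseudocount) / total_after
        scores[mutant] = freq_after / freq_before
    return scores


# Hypothetical read counts; the labels are placeholders, not mutations
# reported in this disclosure.
round0 = Counter({"mutA": 100, "mutB": 100, "mutC": 100})
round1 = Counter({"mutA": 400, "mutB": 90, "mutC": 10})

scores = enrichment(round0, round1)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```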

His Growth Selection of Yeast Libraries

After induction, the cell library was plated on 2% raffinose/3% galactose -Ura/-Trp/-His selection plates. As colonies grew, they were picked into water and streaked on 2% raffinose/3% galactose -Ura/-Trp/-His selection plates. After overnight growth of the streaks, colony PCR was performed on each streak, and the products were subjected to Sanger sequencing of the ADAR2 catalytic domain as well as the His gene to check for recombination and DNA mutagenesis. Mutations were individually ordered and cloned for mammalian validation testing as described below.

Design and Cloning of Mammalian Constructs for RNA Editing

RanCas13b was made catalytically inactive (dRanCas13b) via histidine to alanine and arginine to alanine mutations (R142A/H147A/R1039A/H1044A) at the catalytic site of the HEPN domains. The deaminase domain and ADAR2 were synthesized and PCR amplified for Gibson cloning into pcDNA-CMV vector backbones and were fused to dRanCas13b at the C-terminus via a GS-mapkNES-GS (GSSLQKKLEELELGS (SEQ ID NO:309)) linker. Mutations in the ADAR2 deaminase domain for altering cytosine deamination activity or specificity were introduced by Gibson cloning into the dRanCas13b-GS-mapkNES-GS-ADAR2dd backbone. All mutations introduced into ADAR2dd for evolving C to U editing are listed in Table 25.
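The slash-separated mutation notation used above (wild-type residue, position, substituted residue) is easy to parse and sanity-check. The following sketch is illustrative; the helper name and regular expression are assumptions, and only the mutation string itself is taken from the text.

```python
import re

# One-letter residue, position, one-letter residue, e.g. "R142A".
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(spec: str) -> list[tuple[str, int, str]]:
    """Split a list such as 'R142A/H147A' into (wild type, position, mutant)."""
    parsed = []
    for token in spec.split("/"):
        match = MUTATION_RE.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized mutation: {token!r}")
        wild_type, position, mutant = match.groups()
        parsed.append((wild_type, int(position), mutant))
    return parsed


hepn_mutations = parse_mutations("R142A/H147A/R1039A/H1044A")

# Every HEPN mutation replaces an arginine or a histidine with alanine.
all_to_alanine = all(
    wild_type in {"R", "H"} and mutant == "A"
    for wild_type, _, mutant in hepn_mutations
)
print(hepn_mutations, all_to_alanine)
```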

For comparison between different Cas13b orthologs, mutations tested on the dRanCas13b backbone were transferred to a dPspCas13b fusion vector by Gibson cloning onto the REPAIR construct, dPspCas13b-GS-HIVNES-GS-ADAR2dd. For testing the ADAR2dd alone without dRanCas13b and the full length ADAR2, Applicants used Gibson cloning to add all mutations to pcDNA-CMV vector backbones with ADAR2dd or full length ADAR2, previously cloned to test REPAIR.

Luciferase reporter vectors for measuring C to U RNA editing activity were generated by screening potential mutations in Gluc in the previously reported luciferase reporter plasmid. This reporter vector expresses functional Cluc as a normalization control, but a defective Gluc due to the introduction of a mutation (either C82R or L77P). To test RESCUE editing motif preferences, Applicants cloned every possible motif around the cytosine at codon 82 (AAX CXC) of Gluc. Secreted luciferase reporter vectors for testing CTNNB1 editing efficiency were generated from M50 Super 8× TOPFlash (Addgene #12456) and M50 Super 8× FOPFlash (Addgene #12457). The original firefly luciferase, under control of either TCF/LEF responsive elements (TOPFlash) or mock binding sites (FOPFlash), was replaced with a secreted Gaussia luciferase via Gibson cloning. An additional Cypridina luciferase with expression driven by a CMV promoter was cloned in to serve as a transfection control. All mammalian plasmids are listed in Table 22.
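The motif panel around the codon 82 cytosine can be enumerated directly. The sketch below takes "every possible motif" to mean all combinations of the immediate 5′ and 3′ neighbors of the target cytosine, which gives 16 three-base motifs and is consistent with the motif guides listed in Table 21; the function name is illustrative.

```python
from itertools import product

BASES = "ACGU"


def flanking_motifs(center: str = "C") -> list[str]:
    """All 5'-N, center, 3'-N motifs around a single target base."""
    return [five + center + three for five, three in product(BASES, repeat=2)]


motifs = flanking_motifs()
print(len(motifs), motifs)
```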

Selection of RESCUE Versions in Mammalian Cells

Mutations that performed comparably to or better than the existing version of RESCUE were selected for screening on the entire panel of 6 luciferase reporters. For the selection of RESCUE v4 through v10, candidate mutations were initially screened on TCG motifs; RESCUE v11 was isolated using GCG motifs as the initial screen. RESCUE v12 through v14 were selected in mammalian cells using an initial screen on editing of the T41I residue of endogenous CTNNB1, resulting in beta-catenin pathway activation that was profiled with luminescent reporters of pathway activity, and RESCUE v15 and v16 were selected via activity on the L77P CCT motif of Gluc. All rounds and the yeast screens used to generate them are listed in Table 25.

Cloning Pathogenic U>C Mutations for Assaying RESCUE Activity

To generate disease-relevant mutations for testing RESCUE activity, 23 U>C mutations related to disease pathogenesis, as defined in ClinVar, were selected (grouped as a panel of 22 genes and ApoE independently). Selected targets were ordered from Integrated DNA Technologies as 200-bp regions surrounding the mutation site, and were cloned downstream of mScarlet under an EF1alpha promoter.

Guide Cloning for RESCUE

For expression of mammalian guide RNAs for RESCUE, a previously described construct with a RanCas13b direct repeat sequence preceded by golden-gate acceptor sites under U6 expression was used. Individual guides were cloned into this expression backbone by golden-gate cloning. To determine optimal guides for select sites, both C and U flips were tested, as well as tiling guides around the most common optimal guide range (mismatch distance of ~24).
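The relationship between a target cytosine, the base flip, and the mismatch distance can be sketched in code. This is a hedged illustration rather than the Applicants' design tool: it assumes the spacer is the reverse complement of the target window, that the base opposite the target cytosine is replaced by the flip base (C or U), and that the mismatch distance is the flip position counted from the 3′ end of the spacer, a convention inferred from the tiling guides in Table 21. The 30-nt target window in the example was back-derived from the UCG 30/26 spacer in Table 21 and is not a sequence quoted from the reporter.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def build_spacer(target: str, c_index: int, length: int = 30,
                 mismatch_distance: int = 26, flip: str = "C") -> str:
    """Return a spacer (5'->3') placing `flip` opposite the target cytosine.

    target            -- target RNA, 5'->3' (T is treated as U)
    c_index           -- 0-based index of the target cytosine in `target`
    length            -- spacer length in nucleotides
    mismatch_distance -- flip position counted from the spacer 3' end (1-based)
    flip              -- base placed opposite the target cytosine
    """
    target = target.upper().replace("T", "U")
    if target[c_index] != "C":
        raise ValueError("c_index does not point at a cytosine")
    if not 1 <= mismatch_distance <= length:
        raise ValueError("mismatch_distance must lie within the spacer")
    start = c_index - mismatch_distance + 1
    end = start + length
    if start < 0 or end > len(target):
        raise ValueError("target is too short for this spacer placement")
    # Reverse complement of the window, then swap in the flip base.
    spacer = [COMPLEMENT[base] for base in reversed(target[start:end])]
    spacer[length - mismatch_distance] = flip
    return "".join(spacer)


# Assumed target window (back-derived from Table 21); target C at index 25.
target_window = "UCUGAUCUGCCUGUCCCACAUCAAUCGCAC"
u_flip = build_spacer(target_window, 25, length=30, mismatch_distance=26, flip="U")
c_flip = build_spacer(target_window, 25, length=30, mismatch_distance=26, flip="C")
print(u_flip)
print(c_flip)
```

Tiling guides then correspond to calling the same function with different mismatch distances on a longer target sequence.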

Guide sequences for RESCUE experiments, all yeast plasmids, and all targeting guides used in yeast experiments are listed in Tables 21-26.

TABLE 21 Guide sequences used for luciferase editing

Name | Targeted gene | Motif | Base flip/spacer length/position | Codon change | Spacer sequence | Notes
UCG targeting guide | Gluc | UCG | C/30/26 | C82R | gugcCauugaugugggacaggcagaucaga (SEQ ID NO: 310) | No 5′ G
GCG targeting guide | Gluc | GCG | U/30/20 | C82R | Guugggcgugcucuugaugugggacaggcag (SEQ ID NO: 311) |
ACG targeting guide | Gluc | ACG | C/30/28 | C82R | Ggccuuugaugugggacaggcagaucagaca (SEQ ID NO: 312) |
CCG targeting guide | Gluc | CCG | C/30/26 | C82R | Gugccguugaugugggacaggcagaucaga (SEQ ID NO: 313) | No 5′ G
CCU targeting | Gluc | CCU | C/30/26 | L77P | Gggaacggcagaucagacagccccuggugca (SEQ ID NO: 314) |
CCA targeting | Gluc | CCA | C/30/26 | L77P | Gggauuggcagaucagacagccccuggugca (SEQ ID NO: 315) |
Motif guide UCU, flip U | Gluc | UCU | U/30/26 | L82F | gugaUauugaugugggacaggcagaucaga (SEQ ID NO: 316) |
Motif guide UCG, flip U | Gluc | UCG | U/30/26 | C82R | gugcUauugaugugggacaggcagaucaga (SEQ ID NO: 317) |
Motif guide UCC, flip U | Gluc | UCC | U/30/26 | P82S | guggUauugaugugggacaggcagaucaga (SEQ ID NO: 318) |
Motif guide UCA, flip U | Gluc | UCA | U/30/26 | H82Y | guguUauugaugugggacaggcagaucaga (SEQ ID NO: 319) |
Motif guide ACU, flip U | Gluc | ACU | U/30/26 | L82F | gugaUuuugaugugggacaggcagaucaga (SEQ ID NO: 320) |
Motif guide ACG, flip U | Gluc | ACG | U/30/26 | C82R | gugcUuuugaugugggacaggcagaucaga (SEQ ID NO: 321) |
Motif guide ACC, flip U | Gluc | ACC | U/30/26 | P82S | guggUuuugaugugggacaggcagaucaga (SEQ ID NO: 322) |
Motif guide ACA, flip U | Gluc | ACA | U/30/26 | H82Y | guguUuuugaugugggacaggcagaucaga (SEQ ID NO: 323) |
Motif guide GCU, flip U | Gluc | GCU | U/30/26 | L82F | gugaUcuugaugugggacaggcagaucaga (SEQ ID NO: 324) |
Motif guide GCG, flip U | Gluc | GCG | U/30/26 | C82R | gugcUcuugaugugggacaggcagaucaga (SEQ ID NO: 325) |
Motif guide GCC, flip U | Gluc | GCC | U/30/26 | P82S | guggUcuugaugugggacaggcagaucaga (SEQ ID NO: 326) |
Motif guide GCA, flip U | Gluc | GCA | U/30/26 | H82Y | guguUcuugaugugggacaggcagaucaga (SEQ ID NO: 327) |
Motif guide CCU, flip U | Gluc | CCU | U/30/26 | L82F | gugaUguugaugugggacaggcagaucaga (SEQ ID NO: 328) |
Motif guide CCG, flip U | Gluc | CCG | U/30/26 | C82R | gugcUguugaugugggacaggcagaucaga (SEQ ID NO: 329) |
Motif guide CCC, flip U | Gluc | CCC | U/30/26 | P82S | guggUguugaugugggacaggcagaucaga (SEQ ID NO: 330) |
Motif guide CCA, flip U | Gluc | CCA | U/30/26 | H82Y | guguUguugaugugggacaggcagaucaga (SEQ ID NO: 331) |
Motif guide UCU, flip C | Gluc | UCU | C/30/26 | L82F | gugaCauugaugugggacaggcagaucaga (SEQ ID NO: 332) |
Motif guide UCC, flip C | Gluc | UCC | C/30/26 | P82S | guggCauugaugugggacaggcagaucaga (SEQ ID NO: 333) |
Motif guide UCA, flip C | Gluc | UCA | C/30/26 | H82Y | guguCauugaugugggacaggcagaucaga (SEQ ID NO: 334) |
Motif guide ACU, flip C | Gluc | ACU | C/30/26 | L82F | gugaCuuugaugugggacaggcagaucaga (SEQ ID NO: 335) |
Motif guide ACG, flip C | Gluc | ACG | C/30/26 | C82R | gugcCuuugaugugggacaggcagaucaga (SEQ ID NO: 336) |
Motif guide ACC, flip C | Gluc | ACC | C/30/26 | P82S | guggCuuugaugugggacaggcagaucaga (SEQ ID NO: 337) |
Motif guide ACA, flip C | Gluc | ACA | C/30/26 | H82Y | guguCuuugaugugggacaggcagaucaga (SEQ ID NO: 338) |
Motif guide GCU, flip C | Gluc | GCU | C/30/26 | L82F | gugaCcuugaugugggacaggcagaucaga (SEQ ID NO: 339) |
Motif guide GCG, flip C | Gluc | GCG | C/30/26 | C82R | gugcCcuugaugugggacaggcagaucaga (SEQ ID NO: 340) |
Motif guide GCC, flip C | Gluc | GCC | C/30/26 | P82S | guggCcuugaugugggacaggcagaucaga (SEQ ID NO: 341) |
Motif guide GCA, flip C | Gluc | GCA | C/30/26 | H82Y | guguCcuugaugugggacaggcagaucaga (SEQ ID NO: 342) |
Motif guide CCU, flip C | Gluc | CCU | C/30/26 | L82F | gugaCguugaugugggacaggcagaucaga (SEQ ID NO: 343) |
Motif guide CCC, flip C | Gluc | CCC | C/30/26 | P82S | guggCguugaugugggacaggcagaucaga (SEQ ID NO: 344) |
Motif guide CCA, flip C | Gluc | CCA | C/30/26 | H82Y | guguCguugaugugggacaggcagaucaga (SEQ ID NO: 345) |
Non-targeting guide | N/A | N/A | N/A | N/A | Guaaugccuggcuugucgacgcauagucug (SEQ ID NO: 346) |
Gluc specificity guide with off-target A-G mismatch 1 | Gluc | UCG | C/30/26 | C82R | ggugcuaGugaugugggacaggcagaucaga (SEQ ID NO: 347) | Additional G added specificity
Gluc specificity guide with off-target A-G mismatch 2 | Gluc | UCG | C/30/26 | C82R | ggugcuauGgaugugggacaggcagaucaga (SEQ ID NO: 348) | Additional G added specificity
Gluc specificity guide with off-target A-G mismatch 3 | Gluc | UCG | C/30/26 | C82R | ggugcuauugaugGgggacaggcagaucaga (SEQ ID NO: 349) | Additional G added specificity
Gluc specificity guide with off-target A-G combo 1 + 2 | Gluc | UCG | C/30/26 | C82R | ggugcuaGGgaugugggacaggcagaucaga (SEQ ID NO: 350) | Additional G added specificity
Gluc specificity guide with off-target A-G combo all | Gluc | UCG | C/30/26 | C82R | ggugcuaGGgaugGgggacaggcagaucaga (SEQ ID NO: 351) | Additional G added specificity
A to I REPAIR guide | Cluc | TAG | C/50/34 | *85W | Gcgcccugugcggacuccuugucgccuucguaggugugcagcguccuggg (SEQ ID NO: 352) |
Tiling guide 30 flip 30 U | Gluc | UCG | U/30/30 | C82R | Guauugaugugggacaggcagaucagacagc (SEQ ID NO: 353) |
Tiling guide 30 flip 28 U | Gluc | UCG | U/30/28 | C82R | Ggcuauugaugugggacaggcagaucagaca (SEQ ID NO: 354) |
Tiling guide 30 flip 26 U | Gluc | UCG | U/30/26 | C82R | Ggugcuauugaugugggacaggcagaucaga (SEQ ID NO: 355) |
Tiling guide 30 flip 24 U | Gluc | UCG | U/30/24 | C82R | Ggcgugcuauugaugugggacaggcagauca (SEQ ID NO: 356) |
Tiling guide 30 flip 22 U | Gluc | UCG | U/30/22 | C82R | Ggggcgugcuauugaugugggacaggcagau (SEQ ID NO: 357) |
Tiling guide 30 flip 20 U | Gluc | UCG | U/30/20 | C82R | Guugggcgugcuauugaugugggacaggcag (SEQ ID NO: 358) |
Tiling guide 30 flip 18 U | Gluc | UCG | U/30/18 | C82R | Gucuugggcgugcuauugaugugggacaggc (SEQ ID NO: 359) |
Tiling guide 30 flip 16 U | Gluc | UCG | U/30/16 | C82R | Gcaucuugggcgugcuauugaugugggacag (SEQ ID NO: 360) |
Tiling guide 30 flip 14 U | Gluc | UCG | U/30/14 | C82R | Guucaucuugggcgugcuauugaugugggac (SEQ ID NO: 361) |
Tiling guide 30 flip 12 U | Gluc | UCG | U/30/12 | C82R | Gucuucaucuugggcgugcuauugauguggg (SEQ ID NO: 362) |
Tiling guide 30 flip 10 U | Gluc | UCG | U/30/10 | C82R | Gcuucuucaucuugggcgugcuauugaugug (SEQ ID NO: 363) |
Tiling guide 30 flip 8 U | Gluc | UCG | U/30/8 | C82R | Gaacuucuucaucuugggcgugcuauugaug (SEQ ID NO: 364) |
Tiling guide 30 flip 6 U | Gluc | UCG | U/30/6 | C82R | Gugaacuucuucaucuugggcgugcuauuga (SEQ ID NO: 365) |
Tiling guide 30 flip 4 U | Gluc | UCG | U/30/4 | C82R | Ggaugaacuucuucaucuugggcgugcuauu (SEQ ID NO: 366) |
Tiling guide 30 flip 2 U | Gluc | UCG | U/30/2 | C82R | Ggggaugaacuucuucaucugggcgugcua (SEQ ID NO: 367) |
Tiling guide 50 flip 50 U | Gluc | UCG | U/50/50 | C82R | Guauugaugugggacaggcagaucagacagccccuggugcagccagcuuuc (SEQ ID NO: 368) |
Tiling guide 50 flip 48 U | Gluc | UCG | U/50/48 | C82R | Ggcuauugaugugggacaggcagaucagacagccccuggugcagccagcuu (SEQ ID NO: 369) |
Tiling guide 50 flip 46 U | Gluc | UCG | U/50/46 | C82R | Ggugcuauugaugugggacaggcagaucagacagccccuggugcagccagc (SEQ ID NO: 370) |
Tiling guide 50 flip 44 U | Gluc | UCG | U/50/44 | C82R | Ggcgugcuauugaugugggacaggcagaucagacagccccuggugcagcca (SEQ ID NO: 371) |
Tiling guide 50 flip 42 U | Gluc | UCG | U/50/42 | C82R | Ggggcgugcuauugaugugggacaggcagaucagacagccccuggugcagc (SEQ ID NO: 372) |
Tiling guide 50 flip 40 U | Gluc | UCG | U/50/40 | C82R | Guugggcgugcuauugaugugggacaggcagaucagacagccccuggugca (SEQ ID NO: 373) |
Tiling guide 50 flip 38 U | Gluc | UCG | U/50/38 | C82R | Gucuugggcgugcuauugaugugggacaggcagaucagacagccccuggug (SEQ ID NO: 374) |

TABLE 22 Guide sequences used for endogenous gene editing Base Targetedflip/ Codon Name gene Motif position change Spacer sequenceS33F_CTNNB1_30bp_guide_30_9 CTNNB1 UCU 22 S33F GGGAUUCCACAGUCCA C flipGGUAAGACUGUUGCU (SEQ ID NO: 375) H36Y_CTNNB1_30bp_guide_30_9 CTNNB1 CCA22 H36Y GACCAGAAUUGAUUCC T flip AGAGUCCAGGUAAGA (SEQ ID NO: 376)S37F_CTNNB1_30bp_guide_30_9 CTNNB1 UCU 22 537F GUGGCACCAUAAUGGA T flipUUCCAGAGUCCAGGU (SEQ ID NO: 377) T41I_CTNNB1_30bp_guide_30_11 CTNNB1 ACC20 T41I GAGGAGCUGUGUUAGU T flip GGCACCAGAAUGGAU (SEQ ID NO: 378)P44L_CTNNB1_30bp_guide_30_9 CTNNB1 CCU 22 P44L GUCAGAGAACGAGCUG C flipUGGUAGUGGCACCAG (SEQ ID NO: 379) P44S_CTNNB1_30bp_guide_30_11 CTNNB1 UCU20 P44S GCUCAGAGAAGUAGCU T flip GUGGUAGUGGCACCA (SEQ ID NO: 380)S45F_CTNNB1_30bp_guide_30_11 CTNNB1 UCU 20 S45F GACCACUCAGACAAGG C flipAGCUGUGGUAGUGGC (SEQ ID NO: 381) TCG_KRAS_30bp_guide_30_7 KRAS UCG 24L56L GUGUGUCUAGAAUAUC T flip CAAGAGACAGGUUUC (SEQ ID NO: 382)ACG_KRAS_30bp_guide_30_11 KRAS ACG 20 D30D GGAUCAUAUUCCUCCA C flipCAAAAUGAUUCUGAA (SEQ ID NO: 383) GCG_KRAS_30bp_guide_30_11 KRAS GCG 20G13G GUCUUGCCUACUCCAC T flip CAGCUCCAACUACCA (SEQ ID NO: 384)CCT_KRAS_30bp_guide_30_11 KRAS CCU 20 A18A GGUAUCGUCAACGCAC C flipUCUUGCCUACGCCAC (SEQ ID NO: 385) TCG_PPIB_30bp_guide_30_11 PPIB UCG 20I18I GCGGACCCCGCUAUGA T flip GGGCGGCGGCAAGGA (SEQ ID NO: 386)ACG_PPIB_30bp_guide_30_7 PPIB ACG 24 R7C GAUAUUCCUCCACAAA C flipAUGAUUCUGAAUUAG (SEQ ID NO: 387) GCG_PPIB_30bp_guide_30_11 PPIB GCG 20A19V GGACGGACCCCUCGAU T flip GAGGGCGGCGGCAAG (SEQ ID NO: 388)CCG_PPIB_30bp_guide_30_11 PPIB CCG 20 S21S GGGAAGAAGACCGACC C flipCCGCGAUGAGGGCGG (SEQ ID NO: 389) TCG_SMARCA4_30bp_guide_30_9 SMARCA4 UCG22 S85L GGGUCGUCCUACAUGC T flip CCUUCUCAUGCAUGG (SEQ ID NO: 390)ACG_SMARCA4_30bp_guide_30_11 SMARCA4 ACG 20 D86D GAGCGCGGGUCUUCCG T flipACAUGCCCUUCUCAU (SEQ ID NO: 391) GCG_SMARCA4_30bp_guide_30_11 SMARCA4GCG 20 R89C GUGGUUGUAGCCCGGG C flip UCGUCCGACAUGCCC (SEQ ID NO: 392)CCG_SMARCA4_30bp_guide_30_11 SMARCA4 CCG 20 P88L 
GGUUGUAGCGCUGGUC T flipGUCCGACAUGCCCUU (SEQ ID NO: 393) NRAS_C- NRAS UCC 20 I211GGGAUUAGCUGCAUUG flip_guide_30_11 UCAGUGCGCUUUUCC (SEQ ID NO: 394)NKFB1_T- NFKB ACC 20 P33S GGCCAUCUGUGUUUGA flip_guide_30_11AAUACUUCUGGAUUA (SEQ ID NO: 395) EZH2_T- EZH2 UCA 20 F32FGCAGCUCGUCUUAACC flip_guide_30_11 UCUUGAGCUGUCUCA (SEQ ID NO: 396)NF2_T- NF2 ACG 24 T21M GGUGAACUUCUUGGGU flip_guide_30_7 UGCUUCCUCUUGAGA(SEQ ID NO: 397) RAF1_T- RAF1 UCC 24 P30S GUUGUAGUAGAGAUGCflip_guide_30_7 AGCUGGAGCCAUCAA (SEQ ID NO: 398) STAT3_Y705C- STAT3 UAC34 Y705C GAAACUUGGUCUUCAG flip_50_17 GCAUGGGGCAGCGCUA CCUGGGUCAGCUUCAGGAU (SEQ ID NO: 399) STAT3_S727C- STAT3 UCC 22 S727F GUGCGGGGGCACAUCGflip_30_9 GCAGGUCAAUGGUAU (SEQ ID NO: 400) STAT1_Y701C- STAT1 UAU 34Y705C GCAACUCAGUCUUGAU flip_50_17 ACAUCCAGUUCCUUUA GGGCCAUCAAGUUCCAUUG (SEQ ID NO: 401) STAT1_S727C- STAT1 UCC 22 S727F GCCUCAGGACACAUGGflip_30_9 GGAGCAGGUUGUCUG (SEQ ID NO: 402) S33F_CTNNB1_30bp_guide_U-CTNNB1 UCU 24 S33F GAUUCCAUAGUCCAGG flip_30_7 UAAGACUGUUGCUGC (SEQ IDNO: 403) S33F_CTNNB1_30bp_guide_U- CTNNB1 UCU 22 S33F GGGAUUCCAUAGUCCAflip_30_9 GGUAAGACUGUUGCU (SEQ ID NO: 404) S33F_CTNNB1_30bp_guide_U-CTNNB1 UCU 20 S33F GAUGGAUUCCAUAGUC flip_30_11 CAGGUAAGACUGUUG (SEQ IDNO: 405) S33F_CTNNB1_30bp_guide_U- CTNNB1 UCU 18 S33F GGAAUGGAUUCCAUAGflip_30_13 UCCAGGUAAGACUGU (SEQ ID NO: 406) H36Y_CTNNB1_30bp_guide_U-CTNNB1 CCA 24 H36Y GCAGAAUUGAUUCCAG flip_30_7 AGUCCAGGUAAGACU (SEQ IDNO: 407) H36Y_CTNNB1_30bp_guide_U- CTNNB1 CCA 22 H36Y GACCAGAAUUGAUUCCflip_30_9 AGAGUCCAGGUAAGA (SEQ ID NO: 408) H36Y_CTNNB1_30bp_guide_U-CTNNB1 CCA 20 H36Y GGCACCAGAAUUGAUU flip_30_11 CCAGAGUCCAGGUAA (SEQ IDNO: 409) H36Y_CTNNB1_30bp_guide_U- CTNNB1 CCA 18 H36Y GUGGCACCAGAAUUGAflip_30_13 UUCCAGAGUCCAGGU (SEQ ID NO: 410) S37F_CTNNB1_30bp_guide_U-CTNNB1 UCU 24 S37F GGCACCAUAAUGGAUU flip_30_7 CCAGAGUCCAGGUAA (SEQ IDNO: 411) S37F_CTNNB1_30bp_guide_U- CTNNB1 UCU 22 S37F GUGGCACCAUAAUGGAflip_30_9 UUCCAGAGUCCAGGU (SEQ ID NO: 412) S37F_CTNNB1_30bp_guide_U-CTNNB1 UCU 20 S37F 
GAGUGGCACCAUAAUG flip_30_11 GAUUCCAGAGUCCAG (SEQ IDNO: 413) S37F_CTNNB1_30bp_guide_U- CTNNB1 UCU 18 S37F GGUAGUGGCACCAUAAflip_30_13 UGGAUUCCAGAGUCC (SEQ ID NO: 414) T41I_CTNNB1_30bp_guide_U-CTNNB1 ACC 24 T41I GGCUGUGUUAGUGGCA flip_30_7 CCAGAAUGGAUUCCA (SEQ IDNO: 415) T41I_CTNNB1_30bp_guide_U- CTNNB1 ACC 22 T41I GGAGCUGUGUUAGUGGflip_30_9 CACCAGAAUGGAUUC (SEQ ID NO: 416) T41I_CTNNB1_30bp_guide_U-CTNNB1 ACC 20 T41I GAGGAGCUGUGUUAGU flip_30_11 GGCACCAGAAUGGAU (SEQ IDNO: 417) T41I_CTNNB1_30bp_guide_U- CTNNB1 ACC 18 T41I GGAAGGAGCUGUGUUAflip_30_13 GUGGCACCAGAAUGG (SEQ ID NO: 418) P44L_CTNNB1_30bp_guide_U-CTNNB1 CCU 24 P44L GAGAGAAUGAGCUGUG flip_30_7 GUAGUGGCACCAGAA (SEQ IDNO: 419) P44L_CTNNB1_30bp_guide_U- CTNNB1 CCU 22 P44L GUCAGAGAAUGAGCUGflip_30_9 UGGUAGUGGCACCAG (SEQ ID NO: 420) P44L_CTNNB1_30bp_guide_U-CTNNB1 CCU 20 P44L GACUCAGAGAAUGAGC flip_30_11 UGUGGUAGUGGCACC (SEQ IDNO: 421) P44L_CTNNB1_30bp_guide_U- CTNNB1 CCU 18 P44L GCCACUCAGAGAAUGAflip_30_13 GCUGUGGUAGUGGCA (SEQ ID NO: 422) P44S_CTNNB1_30bp_guide_U-CTNNB1 UCU 24 P44S GGAGAAGUAGCUGUGG flip_30_7 UAGUGGCACCAGAAU (SEQ IDNO: 423) P44S_CTNNB1_30bp_guide_U- CTNNB1 UCU 22 P44S GCAGAGAAGUAGCUGUflip_30_9 GGUAGUGGCACCAGA (SEQ ID NO: 424) P44S_CTNNB1_30bp_guide_U-CTNNB1 UCU 20 P44S GCUCAGAGAAGUAGCU flip_30_11 GUGGUAGUGGCACCA (SEQ IDNO: 425) P44S_CTNNB1_30bp_guide_U- CTNNB1 UCU 18 P44S GCACUCAGAGAAGUAGflip_30_13 CUGUGGUAGUGGCAC (SEQ ID NO: 426) S45F_CTNNB1_30bp_guide_U-CTNNB1 UCU 24 S45F GCUCAGAUAAGGAGCU flip_30_7 GUGGUAGUGGCACCA (SEQ IDNO: 427) S45F_CTNNB1_30bp_guide_U- CTNNB1 UCU 22 S45F GCACUCAGAUAAGGAGflip_30_9 CUGUGGUAGUGGCAC (SEQ ID NO: 428) S45F_CTNNB1_30bp_guide_U-CTNNB1 UCU 20 S45F GACCACUCAGAUAAGG flip_30_11 AGCUGUGGUAGUGGC (SEQ IDNO: 429) S45F_CTNNB1_30bp_guide_U- CTNNB1 UCU 18 S45F GUUACCACUCAGAUAAflip_30_13 GGAGCUGUGGUAGUG (SEQ ID NO: 430) TCG_KRAS_30bp_guide_U- KRASUCG 24 L56L GUGUGUCUAGAAUAUC flip_30_7 CAAGAGACAGGUUUC (SEQ ID NO: 431)TCG_KRAS_30bp_guide_U- KRAS UCG 22 L56L GGCUGUGUCUAGAAUA 
flip_30_9UCCAAGAGACAGGUU (SEQ ID NO: 432) TCG_KRAS_30bp_guide_U- KRAS UCG 20 L56LGCUGCUGUGUCUAGAA flip_30_11 UAUCCAAGAGACAGG (SEQ ID NO: 433)TCG_KRAS_30bp_guide_U- KRAS UCG 18 L56L GACCUGCUGUGUCUAG flip_30_13AAUAUCCAAGAGACA (SEQ ID NO: 434) ACG_KRAS_30bp_guide_U- KRAS ACG 24 D30DGAUAUUCUUCCACAAA flip_30_7 AUGAUUCUGAAUUAG (SEQ ID NO: 435)ACG_KRAS_30bp_guide_U- KRAS ACG 22 D30D GUCAUAUUCUUCCACA flip_30_9AAAUGAUUCUGAAUU (SEQ ID NO: 436) ACG_KRAS_30bp_guide_U- KRAS ACG 20 D30DGGAUCAUAUUCUUCCA flip_30_11 CAAAAUGAUUCUGAA (SEQ ID NO: 437)ACG_KRAS_30bp_guide_U- KRAS ACG 18 D30D GUGGAUCAUAUUCUUC flip_30_13CACAAAAUGAUUCUG (SEQ ID NO: 438) GCG_KRAS_30bp_guide_U- KRAS GCG 24 G13GGGCCUACUCCACCAGC flip_30_7 UCCAACUACCACAAG (SEQ ID NO: 439)GCG_KRAS_30bp_guide_U- KRAS GCG 22 G13G GUUGCCUACUCCACCA flip_30_9GCUCCAACUACCACA (SEQ ID NO: 440) GCG_KRAS_30bp_guide_U- KRAS GCG 20 G13GGUCUUGCCUACUCCAC flip_30_11 CAGCUCCAACUACCA (SEQ ID NO: 441)GCG_KRAS_30bp_guide_U- KRAS GCG 18 G13G GACUCUUGCCUACUCC flip_30_13ACCAGCUCCAACUAC (SEQ ID NO: 442) CCT_KRAS_30bp_guide_U- KRAS CCU 24 A18AGCGUCAAUGCACUCUU flip_30_7 GCCUACGCCACCAGC (SEQ ID NO: 443)CCT_KRAS_30bp_guide_U- KRAS CCU 22 A18A GAUCGUCAAUGCACUC flip_30_9UUGCCUACGCCACCA (SEQ ID NO: 444) CCT_KRAS_30bp_guide_U- KRAS CCU 20 A18AGGUAUCGUCAAUGCAC flip_30_11 UCUUGCCUACGCCAC (SEQ ID NO: 445)CCT_KRAS_30bp_guide_U- KRAS CCU 18 A18A GCUGUAUCGUCAAUGC flip_30_13ACUCUUGCCUACGCC (SEQ ID NO: 446) TCG_PPIB_30bp_guide_U- PPIB UCG 24 I18IGCCCCGCUAUGAGGGC flip_30_7 GGCGGCAAGGAGCAC (SEQ ID NO: 447)TCG_PPIB30bp_guide_U- PPIB UCG 22 I18I GGACCCCGCUAUGAGG flip_30_9GCGGCGGCAAGGAGC (SEQ ID NO: 448) TCG_PPIB_30bp_guide_U- PPIB UCG 20 I18IGCGGACCCCGCUAUGA flip_30_11 GGGCGGCGGCAAGGA (SEQ ID NO: 449)TCG_PPIB_30bp_guide_U- PPIB UCG 18 I181 GGACGGACCCCGCUAU flip_30_13GAGGGCGGCGGCAAG (SEQ ID NO: 450) ACG_PPIB_30bp_guide_U- PPIB ACG 24 R7CGUGUUGCUUUCGGAGA flip_30_7 GGCGCAGCAUCCACA (SEQ ID NO: 451)ACG_PPIB_30bp_guide_U- PPIB ACG 22 R7C GCAUGUUGCUUUCGGA 
flip_30_9GAGGCGCAGCAUCCA (SEQ ID NO: 452) ACG_PPIB_30bp_guide_U- PPIB ACG 20 R7CGUUCAUGUUGCUUUCG flip_30_11 GAGAGGCGCAGCAUC (SEQ ID NO: 453)ACG_PPIB_30bp_guide_U- PPIB ACG 18 R7C GCCUUCAUGUUGCUUU flip_30_13CGGAGAGGCGCAGCA (SEQ ID NO: 454) GCG_PPIB_30bp_guide_U- PPIB GCG 24 A19VGGACCCCUCGAUGAGG flip_30_7 GCGGCGGCAAGGAGC (SEQ ID NO: 455)GCG_PPIB_30bp_guide_U- PPIB GCG 22 A19V GCGGACCCCUCGAUGA flip_30_9GGGCGGCGGCAAGGA (SEQ ID NO: 456) GCG_PPIB_30bp_guide_U- PPIB GCG 20 A19VGGACGGACCCCUCGAU flip_30_11 GAGGGCGGCGGCAAG (SEQ ID NO: 457)GCG_PPIB_30bp_guide_U- PPIB GCG 18 A19V GAAGACGGACCCCUCG flip_30_13AUGAGGGCGGCGGCA (SEQ ID NO: 458) CCG_PPIB_30bp_guide_U- PPIB CCG 24 S21SGGAAGACUGACCCCGC flip_30_7 GAUGAGGGCGGCGGC (SEQ ID NO: 459)CCG_PPIB_30bp_guide_U- PPIB CCG 22 S21S GAAGAAGACUGACCCC flip_30_9GCGAUGAGGGCGGCG (SEQ ID NO: 460) CCG_PPIB_30bp_guide_U- PPIB CCG 20 S21SGGGAAGAAGACUGACC flip_30_11 CCGCGAUGAGGGCGG (SEQ ID NO: 461)CCG_PPIB_30bp_guide_U- PPIB CCG 18 S21S GCAGGAAGAAGACUGA flip_30_13CCCCGCGAUGAGGGC (SEQ ID NO: 462) TCG_SMARCA4_30bp_guide_U- SMARCA4 UCG24 S85L GUCGUCCUACAUGCCC flip_30_7 UUCUCAUGCAUGGAC (SEQ ID NO: 463)TCG_SMARCA4_30bp_guide_U- SMARCA4 UCG 22 S85L GGGUCGUCCUACAUGC flip_30_9CCUUCUCAUGCAUGG (SEQ ID NO: 464) TCG_SMARCA4_30bp_guide_U- SMARCA4 UCG20 S85L GCGGGUCGUCCUACAU flip_30_11 GCCCUUCUCAUGCAU (SEQ ID NO: 465)TCG_SMARCA4_30bp_guide_U- SMARCA4 UCG 18 S85L GCGCGGGUCGUCCUACflip_30_13 AUGCCCUUCUCAUGC (SEQ ID NO: 466) ACG_SMARCA4_30bp_guide_U-SMARCA4 ACG 24 D86D GCGGGUCUUCCGACAU flip_30_7 GCCCUUCUCAUGCAU (SEQ IDNO: 467) ACG_SMARCA4_30bp_guide_U- SMARCA4 ACG 22 D86D GCGCGGGUCUUCCGACflip_30_9 AUGCCCUUCUCAUGC (SEQ ID NO: 468) ACG_SMARCA4_30bp_guide_U-SMARCA4 ACG 20 D86D GAGCGCGGGUCUUCCG flip_30_11 ACAUGCCCUUCUCAU (SEQ IDNO: 469) ACG_SMARCA4_30bp_guide_U- SMARCA4 ACG 18 D86D GGUAGCGCGGGUCUUCflip_30_13 CGACAUGCCCUUCUC (SEQ ID NO: 470) GCG_SMARCA4_30bp_guide_U-SMARCA4 GCG 24 R89C GUGUAGCUCGGGUCGU flip_30_7 CCGACAUGCCCUUCU (SEQ IDNO: 471) 
GCG_SMARCA4_30bp_guide_U- SMARCA4 GCG 22 R89C GGUUGUAGCUCGGGUCflip_30_9 GUCCGACAUGCCCUU (SEQ ID NO: 472) GCG_SMARCA4_30bp_guide_U-SMARCA4 GCG 20 R89C GUGGUUGUAGCUCGGG flip_30_11 UCGUCCGACAUGCCC (SEQ IDNO: 473) GCG_SMARCA4_30bp_guide_U- SMARCA4 GCG 18 R89C GUCUGGUUGUAGCUCGflip_30_13 GGUCGUCCGACAUGC (SEQ ID NO: 474) CCG_SMARCA4_30bp_guide_U-SMARCA4 CCG 24 P88L GUAGCGCUGGUCGUCC flip_30_7 GACAUGCCCUUCUCA (SEQ IDNO: 475) CCG_SMARCA4_30bp_guide_U- SMARCA4 CCG 22 P88L GUGUAGCGCUGGUCGUflip_30_9 CCGACAUGCCCUUCU (SEQ ID NO: 476) CCG_SMARCA4_30bp_guide_U-SMARCA4 CCG 20 P88L GGUUGUAGCGCUGGUC flip_30_11 GUCCGACAUGCCCUU (SEQ IDNO: 477) CCG_SMARCA4_30bp_guide_U- SMARCA4 CCG 18 P88L GUGGUUGUAGCGCUGGflip_30_13 UCGUCCGACAUGCCC (SEQ ID NO: 478) S33F_CTNNB1_30bp_C- CTNNB1UCU 24 S33F GAUUCCACAGUCCAGG flip_guide_30_7 UAAGACUGUUGCUGC (SEQ IDNO: 479) S33F_CTNNB1_30bp_C- CTNNB1 UCU 22 S33F GGGAUUCCACAGUCCAflip_guide_30_9 GGUAAGACUGUUGCU (SEQ ID NO: 480) S33F_CTNNB1_30bp_C-CTNNB1 UCU 20 S33F GAUGGAUUCCACAGUC flip_guide_30_11 CAGGUAAGACUGUUG(SEQ ID NO: 481) S33F_CTNNB1_30bp_C- CTNNB1 UCU 18 S33F GGAAUGGAUUCCACAGflip_guide_30_13 UCCAGGUAAGACUGU (SEQ ID NO: 482) H36Y_CTNNB1_30bp_C-CTNNB1 CCA 24 H36Y GCAGAAUCGAUUCCAG flip_guide_30_7 AGUCCAGGUAAGACU(SEQ ID NO: 483) H36Y_CTNNB1_30bp_C- CTNNB1 CCA 22 H36Y GACCAGAAUCGAUUCCflip_guide_30_9 AGAGUCCAGGUAAGA (SEQ ID NO: 484) H36Y_CTNNB1_30bp_C-CTNNB1 CCA 20 H36Y GGCACCAGAAUCGAUU flip_guide_30_11 CCAGAGUCCAGGUAA(SEQ ID NO: 485) H36Y_CTNNB1_30bp_C CTNNB1 CCA 18 H36Y GUGGCACCAGAAUCGAflip_guide_30_13 UUCCAGAGUCCAGGU (SEQ ID NO: 486) S37F_CTNNB1_30bp_C-CTNNB1 UCU 24 S37F GGCACCACAAUGGAUU flip_guide_30_7 CCAGAGUCCAGGUAA(SEQ ID NO: 487) S37F_CTNNB1_30bp_C- CTNNB1 UCU 22 S37F GUGGCACCACAAUGGAflip_guide_30_9 UUCCAGAGUCCAGGU (SEQ ID NO: 488) S37F_CTNNB1_30bp_C-CTNNB1 UCU 20 S37F GAGUGGCACCACAAUG flip_guide_30_11 GAUUCCAGAGUCCAG(SEQ ID NO: 489) S37F_CTNNB1_30bp_C- CTNNB1 UCU 18 S37F GGUAGUGGCACCACAAflip_guide_30_13 UGGAUUCCAGAGUCC (SEQ ID NO: 490) 
T41I_CTNNB1_30bp_C-CTNNB1 ACC 24 T41I GGCUGUGCUAGUGGCA flip_guide_30_7 CCAGAAUGGAUUCCA(SEQ ID NO: 491) T41I_CTNNB1_30bp_C- CTNNB1 ACC 22 T41I GGAGCUGUGCUAGUGGflip_guide_30_9 CACCAGAAUGGAUUC (SEQ ID NO: 492) T41I_CTNNB1_30bp_C-CTNNB1 ACC 20 T41I GAGGAGCUGUGCUAGU flip_guide_30_11 GGCACCAGAAUGGAU(SEQ ID NO: 493) T41I_CTNNB1_30bp_C- CTNNB1 ACC 18 T41I GGAAGGAGCUGUGCUAflip_guide_30_13 GUGGCACCAGAAUGG (SEQ ID NO: 494) P44L_CTNNB1_30bp_C-CTNNB1 CCU 24 P44L GAGAGAACGAGCUGUG flip_guide_30_7 GUAGUGGCACCAGAA(SEQ ID NO: 495) P44L_CTNNB1_30bp_C- CTNNB1 CCU 22 P44L GUCAGAGAACGAGCUGflip_guide_30_9 UGGUAGUGGCACCAG (SEQ ID NO: 496) P44L_CTNNB1_30bp_C-CTNNB1 CCU 20 P44L GACUCAGAGAACGAGC flip_guide_30_11 UGUGGUAGUGGCACC(SEQ ID NO: 497) P44L_CTNNB1_30bp_C- CTNNB1 CCU 18 P44L GCCACUCAGAGAACGAflip_guide_30_13 GCUGUGGUAGUGGCA (SEQ ID NO: 498) P44S_CTNNB1_30bp_C-CTNNB1 UCU 24 P44S GGAGAAGCAGCUGUGG flip_guide_30_7 UAGUGGCACCAGAAU(SEQ ID NO: 499) P44S_CTNNB1_30bp_C- CTNNB1 UCU 22 P44S GCAGAGAAGCAGCUGUflip_guide_30_9 GGUAGUGGCACCAGA (SEQ ID NO: 500) P44S_CTNNB1_30bp_C-CTNNB1 UCU 20 P44S GCUCAGAGAAGCAGCU flip_guide_30_11 GUGGUAGUGGCACCA(SEQ ID NO: 501) P44S_CTNNB1_30bp_C- CTNNB1 UCU 18 P44S GCACUCAGAGAAGCAGflip_guide_30_13 CUGUGGUAGUGGCAC (SEQ ID NO: 502) S45F_CTNNB1_30bp_C-CTNNB1 UCU 24 S45F GCUCAGACAAGGAGCU flip_guide_30_7 GUGGUAGUGGCACCA(SEQ ID NO: 503) S45F_CTNNB1_30bp_C- CTNNB1 UCU 22 S45F GCACUCAGACAAGGAGflip_guide_30_9 CUGUGGUAGUGGCAC (SEQ ID NO: 504) S45F_CTNNB1_30bp_C-CTNNB1 UCU 20 S45F GACCACUCAGACAAGG flip_guide_30_11 AGCUGUGGUAGUGGC(SEQ ID NO: 505) S45F_CTNNB1_30bp_C- CTNNB1 UCU 18 S45F GUUACCACUCAGACAAflip_guide_30_13 GGAGCUGUGGUAGUG (SEQ ID NO: 506) TCG_KRAS_30bp_C- KRASUCG 24 L56L GUGUGUCCAGAAUAUC flip_guide_30_7 CAAGAGACAGGUUUC (SEQ IDNO: 507) TCG_KRAS_30bp_C- KRAS UCG 22 L56L GGCUGUGUCCAGAAUAflip_guide_30_9 UCCAAGAGACAGGUU (SEQ ID NO: 508) TCG_KRAS_30bp_C- KRASUCG 20 L56L GCUGCUGUGUCCAGAA flip_guide_30_11 UAUCCAAGAGACAGG (SEQ IDNO: 509) TCG_KRAS_30bp_C- KRAS UCG 18 L56L 
GACCUGCUGUGUCCAGflip_guide_30_13 AAUAUCCAAGAGACA (SEQ ID NO: 510) ACG_KRAS_30bp_C- KRASACG 24 D30D GAUAUUCCUCCACAAA flip_guide_30_7 AUGAUUCUGAAUUAG (SEQ IDNO: 511) ACG_KRAS_30bp_C- KRAS ACG 22 D30D GUCAUAUUCCUCCACAflip_guide_30_9 AAAUGAUUCUGAAUU (SEQ ID NO: 512) ACG_KRAS_30bp_C- KRASACG 20 D30D GGAUCAUAUUCCUCCA flip_guide_30_11 CAAAAUGAUUCUGAA (SEQ IDNO: 513) ACG_KRAS_30bp_C- KRAS ACG 18 D30D GUGGAUCAUAUUCCUCflip_guide_30_13 CACAAAAUGAUUCUG (SEQ ID NO: 514) GCG_KRAS_30bp_C- KRASGCG 24 G13G GGCCUACCCCACCAGC flip_guide_30_7 UCCAACUACCACAAG (SEQ IDNO: 515) GCG_KRAS_30bp_C- KRAS GCG 22 G13G GUUGCCUACCCCACCAflip_guide_30_9 GCUCCAACUACCACA (SEQ ID NO: 516) GCG_KRAS_30bp_C- KRASGCG 20 G13G GUCUUGCCUACCCCAC flip_guide_30_13 CAGCUCCAACUACCA (SEQ IDNO: 517) GCG_KRAS_30bp_C- KRAS GCG 18 G13G GACUCUUGCCUACCCCflip_guide_30_13 ACCAGCUCCAACUAC (SEQ ID NO: 518) CCT_KRAS_30bp_C- KRASCCU 24 A18A GCGUCAACGCACUCUU flip_guide_30_7 GCCUACGCCACCAGC (SEQ IDNO: 519) CCT_KRAS_30bp_C- KRAS CCU 22 A18A GAUCGUCAACGCACUCflip_guide_30_9 UUGCCUACGCCACCA (SEQ ID NO: 520) CCT_KRAS_30bp_C- KRASCCU 20 A18A GGUAUCGUCAACGCAC flip_guide_30_11 UCUUGCCUACGCCAC (SEQ IDNO: 521) CCT_KRAS_30bp_C- KRAS CCU 18 A18A GCUGUAUCGUCAACGCflip_guide_30_13 ACUCUUGCCUACGCC (SEQ ID NO: 522) TCG_PPIB__30bp_C- PPIBUCG 24 I181 GCCCCGCCAUGAGGGC flip_guide_30_7 GGCGGCAAGGAGCAC (SEQ IDNO: 523) TCG_PPIB_30bp_C- PPIB UCG 22 I181 GGACCCCGCCAUGAGGflip_guide_30_9 GCGGCGGCAAGGAGC (SEQ ID NO: 524) TCG_PPIB_30bp_C- PPIBUCG 20 I181 GCGGACCCCGCCAUGA flip_guide_30_11 GGGCGGCGGCAAGGA (SEQ IDNO: 525) TCG_PPIB_30bp_C- PPIB UCG 18 I181 GGACGGACCCCGCCAUflip_guide_30_13 GAGGGCGGCGGCAAG (SEQ ID NO: 526) ACG_PPIB_30bp_C- PPIBACG 24 R7C GUGUUGCCUUCGGAGA flip_guide_30_7 GGCGCAGCAUCCACA (SEQ IDNO: 527) ACG_PPIB_30bp_C- PPIB ACG 22 R7C GCAUGUUGCCUUCGGAflip_guide_30_9 GAGGCGCAGCAUCCA (SEQ ID NO: 528) ACG_PPIB_30bp_C- PPIBACG 20 R7C GUUCAUGUUGCCUUCG flip_guide_30_11 GAGAGGCGCAGCAUC (SEQ IDNO: 529) ACG_PPIB_30bp_C- PPIB ACG 18 R7C 
GCCUUCAUGUUGCCUUflip_guide_30_13 CGGAGAGGCGCAGCA (SEQ ID NO: 530) GCG_PPIB_30bp_C- PPIBGCG 24 A19V GGACCCCCCGAUGAGG flip_guide_30_7 GCGGCGGCAAGGAGC (SEQ IDNO: 531) GCG_PPIB_30bp_C- PPIB GCG 22 A19V GCGGACCCCCCGAUGAflip_guide_30_9 GGGCGGCGGCAAGGA (SEQ ID NO: 532) GCG_PPIB_30bp_C- PPIBGCG 20 A19V GGACGGACCCCCCGAU flip_guide_30_11 GAGGGCGGCGGCAAG (SEQ IDNO: 533) GCG_PPIB_30bp_C- PPIB GCG 18 A19V GAAGACGGACCCCCCGflip_guide_30_13 AUGAGGGCGGCGGCA (SEQ ID NO: 534) CCG_PPIB_30bp_C- PPIBCCG 24 S21S GGAAGACCGACCCCGC flip_guide_30_7 GAUGAGGGCGGCGGC (SEQ IDNO: 535) CCG_PPIB_30bp_C- PPIB CCG 22 S21S GAAGAAGACCGACCCCflip_guide_30_9 GCGAUGAGGGCGGCG (SEQ ID NO: 536) CCG_PPIB_30bp_C- PPIBCCG 20 S21S GGGAAGAAGACCGACC flip_guide_30_11 CCGCGAUGAGGGCGG (SEQ IDNO: 537) CCG_PPIB_30bp_C- PPIB CCG 18 S21S GCAGGAAGAAGACCGAflip_guide_30_13 CCCCGCGAUGAGGGC (SEQ ID NO: 538) TCG_SMARCA4_30bp_C-SMARCA4 UCG 24 S85L GUCGUCCCACAUGCCC flip_guide_30_7 UUCUCAUGCAUGGAC(SEQ ID NO: 539) TCG_SMARCA4_30bp_C- SMARCA4 UCG 22 S85LGGGUCGUCCCACAUGC flip_guide_30_9 CCUUCUCAUGCAUGG (SEQ ID NO: 540)TCG_SMARCA4_30bp_C- SMARCA4 UCG 20 S85L GCGGGUCGUCCCACAUflip_guide_30_11 GCCCUUCUCAUGCAU (SEQ ID NO: 541) TCG_SMARCA4_30bp_C-SMARCA4 UCG 18 S85L GCGCGGGUCGUCCCAC flip_guide_30_13 AUGCCCUUCUCAUGC(SEQ ID NO: 542) ACG_SMARCA4_30bp_C- SMARCA4 ACG 24 D86DGCGGGUCCUCCGACAU flip_guide_30_7 GCCCUUCUCAUGCAU (SEQ ID NO: 543)ACG_SMARCA4_30bp_C- SMARCA4 ACG 22 D86D GCGCGGGUCCUCCGAC flip_guide_30_9AUGCCCUUCUCAUGC (SEQ ID NO: 544) ACG_SMARCA4_30bp_C- SMARCA4 ACG 20 D86DGAGCGCGGGUCCUCCG flip_guide_30_11 ACAUGCCCUUCUCAU (SEQ ID NO: 545)ACG_SMARCA4_30bp_C- SMARCA4 ACG 18 D86D GGUAGCGCGGGUCCUCflip_guide_30_13 CGACAUGCCCUUCUC (SEQ ID NO: 546) GCG_SMARCA4_30bp_C-SMARCA4 GCG 24 R89C GUGUAGCCCGGGUCGU flip_guide_30_7 CCGACAUGCCCUUCU(SEQ ID NO: 547) GCG_SMARCA4_30bp_C- SMARCA4 GCG 22 R89CGGUUGUAGCCCGGGUC flip_guide_30_9 GUCCGACAUGCCCUU (SEQ ID NO: 548)GCG_SMARCA4_30bp_C- SMARCA4 GCG 20 R89C GUGGUUGUAGCCCGGGflip_guide_30_11 UCGUCCGACAUGCCC (SEQ 
ID NO: 549) GCG_SMARCA4_30bp_C-SMARCA4 GCG 18 R89C GUCUGGUUGUAGCCCG flip_guide_30_13 GGUCGUCCGACAUGC(SEQ ID NO: 550) CCG_SMARCA4_30bp_C- SMARCA4 CCG 24 P88LGUAGCGCCGGUCGUCC flip_guide_30_7 GACAUGCCCUUCUCA (SEQ ID NO: 551)CCG_SMARCA4_30bp_C- SMARCA4 CCG 22 P88L GUGUAGCGCCGGUCGU flip_guide_30_9CCGACAUGCCCUUCU (SEQ ID NO: 552) CCG_SMARCA4_30bp_C- SMARCA4 CCG 20 P88LGGUUGUAGCGCCGGUC flip_guide_30_11 GUCCGACAUGCCCUU (SEQ ID NO: 553)CCG_SMARCA4_30bp_C- SMARCA4 CCG 18 P88L GUGGUUGUAGCGCCGGflip_guide_30_13 UCGUCCGACAUGCCC (SEQ ID NO: 554) NRAS_30bp_C- NRAS UCC28 I21I GUGCAUUGUCAGUGCG flip_guide_30_3 CUUUUCCCAACACCA (SEQ IDNO: 555) NRAS_30bp_C- NRAS UCC 26 I21I GGCUGCAUUGUCAGUG flip_guide_30_5CGCUUUUCCCAACAC (SEQ ID NO: 556) NRAS_30bp_C- NRAS UCC 24 I21IGUAGCUGCAUUGUCAG flip_guide_30_7 UGCGCUUUUCCCAAC (SEQ ID NO: 557)NRAS_30bp_C- NRAS UCC 22 I21I GAUUAGCUGCAUUGUC flip_guide_30_9AGUGCGCUUUUCCCA (SEQ ID NO: 558) NRAS_30bp_C- NRAS UCC 20 I21IGGGAUUAGCUGCAUUG flip_guide_30_11 UCAGUGCGCUUUUCC (SEQ ID NO: 559)NKFB1_30bp_C- NKFB1 ACC 28 P33S GUGCUUGAAAUACUUC flip_guide_30_3UGGAUUAAAUAUUGU (SEQ ID NO: 560) NKFB1_30bp_C- NKFB1 ACC 26 P33SGUGUGCUUGAAAUACU flip_guide_30_5 UCUGGAUUAAAUAUU (SEQ ID NO: 561)NKFB1_30bp_C- NKFB1 ACC 24 P33S GUCUGUGCUUGAAAUA flip_guide_30_7CUUCUGGAUUAAAUA (SEQ ID NO: 562) NKFB1_30bp_C- NKFB1 ACC 22 P33SGCAUCUGUGCUUGAAA flip_guide_30_9 UACUUCUGGAUUAAA (SEQ ID NO: 563)NKFB1_30bp_C- NKFB1 ACC 20 P33S GGCCAUCUGUGCUUGA flip_guide_30_11AAUACUUCUGGAUUA (SEQ ID NO: 564) EZH2_30bp_C- EZH2 UCA 28 F32FGCUCAACCUCUUGAGC flip_guide_30_3 UGUCUCAGUCGCAUG (SEQ ID NO: 565)EZH2_30bp_C- EZH2 UCA 26 F32F GGUCUCAACCUCUUGA flip_guide_30_5GCUGUCUCAGUCGCA (SEQ ID NO: 566) EZH2_30bp_C- EZH2 UCA 24 F32FGUCGUCUCAACCUCUU flip_guide_30_7 GAGCUGUCUCAGUCG (SEQ ID NO: 567)EZH2_30bp_C- EZH2 UCA 22 F32F GGCUCGUCUCAACCUC flip_guide_30_9UUGAGCUGUCUCAGU (SEQ ID NO: 568) EZH2_30bp_C- EZH2 UCA 20 F32FGCAGCUCGUCUCAACC flip_guide_30_11 UCUUGAGCUGUCUCA (SEQ ID NO: 569)NF2_30bp_C- NF2 ACG 28 T21M 
GACCUCUUGGGUUGCU flip_guide_30_3UCCUCUUGAGAGAGC (SEQ ID NO: 570) NF2_30bp_C- NF2 ACG 26 T21MGGAACCUCUUGGGUUG flip_guide_30_5 CUUCCUCUUGAGAGA (SEQ ID NO: 571)NF2_30bp_C- NF2 ACG 24 T21M GGUGAACCUCUUGGGU flip_guide_30_7UGCUUCCUCUUGAGA (SEQ ID NO: 572) NF2_30bp_C- NF2 ACG 22 T21MGCGGUGAACCUCUUGG flip_guide_30_9 GUUGCUUCCUCUUGA (SEQ ID NO: 573)NF2_30bp_C- NF2 ACG 20 T21M GCACGGUGAACCUCUU flip_guide_30_11GGGUUGCUUCCUCUU (SEQ ID NO: 574) RAF1_30bp_C- RAF1 UCC 28 P30SGAGCAGAGAUGCAGCU flip_guide_30_3 GGAGCCAUCAAACAC (SEQ ID NO: 575)RAF1_30bp_C- RAF1 UCC 26 P30S GGUAGCAGAGAUGCAG flip_guide_30_5CUGGAGCCAUCAAAC (SEQ ID NO: 576) RAF1_30bp_C- RAF1 UCC 24 P30SGUUGUAGCAGAGAUGC flip_guide_30_7 AGCUGGAGCCAUCAA (SEQ ID NO: 577)RAF1_30bpC- RAF1 UCC 22 P30S GUAUUGUAGCAGAGAU flip_guide_30_9GCAGCUGGAGCCAUC (SEQ ID NO: 578) RAF1_30bp_C- RAF1 UCC 20 P30SGACUAUUGUAGCAGAG flip_guide_30_11 AUGCAGCUGGAGCCA (SEQ ID NO: 579)NRAS_30bp_T- NRAS UCC 28 I21I GUGUAUUGUCAGUGCG flip_guide_30_3CUUUUCCCAACACCA (SEQ ID NO: 580) NRAS_30bp_T- NRAS UCC 26 I21IGGCUGUAUUGUCAGUG flip_guide_30_5 CGCUUUUCCCAACAC (SEQ ID NO: 581)NRAS_30bp_T- NRAS UCC 24 I21I GUAGCUGUAUUGUCAG flip_guide_30_7UGCGCUUUUCCCAAC (SEQ ID NO: 582) NRAS_30bp_T- NRAS UCC 22 I21IGAUUAGCUGUAUUGUC flip_guide_30_9 AGUGCGCUUUUCCCA (SEQ ID NO: 583)NRAS_30bp_T- NRAS UCC 20 I21I GGGAUUAGCUGUAUUG flip_guide_30_11UCAGUGCGCUUUUCC (SEQ ID NO: 584) NKFB1_30bp_T- NKFB1 ACC 28 P33SGUGUUUGAAAUACUUC flip_guide_30_3 UGGAUUAAAUAUUGU (SEQ ID NO: 585)NKFB1_30bp_T- NKFB1 ACC 26 P33S GUGUGUUUGAAAUACU flip_guide_30_5UCUGGAUUAAAUAUU (SEQ ID NO: 586) NKFB1_30bp_T- NKFB1 ACC 24 P33SGUCUGUGUUUGAAAUA flip_guide_30_7 CUUCUGGAUUAAAUA (SEQ ID NO: 587)NKFB1_30bp_T- NKFB1 ACC 22 P33S GCAUCUGUGUUUGAAA flip_guide_30_9UACUUCUGGAUUAAA (SEQ ID NO: 588) NKFB1_30bp_T- NKFB1 ACC 20 P33SGGCCAUCUGUGUUUGA flip_guide_30_11 AAUACUUCUGGAUUA (SEQ ID NO: 589)EZH2_30bp_T- EZH2 UCA 28 F32F GCUUAACCUCUUGAGC flip_guide_30_3UGUCUCAGUCGCAUG (SEQ ID NO: 590) EZH2_30bp_T- EZH2 UCA 26 
F32FGGUCUUAACCUCUUGA flip_guide_30_5 GCUGUCUCAGUCGCA (SEQ ID NO: 591)EZH2_30bp_T- EZH2 UCA 24 F32F GUCGUCUUAACCUCUU flip_guide_30_7GAGCUGUCUCAGUCG (SEQ ID NO: 592) EZH2_30bp_T- EZH2 UCA 22 F32FGGCUCGUCUUAACCUC flip_guide_30_9 UUGAGCUGUCUCAGU (SEQ ID NO: 593)EZH2_30bpT- EZH2 UCA 20 F32F GCAGCUCGUCUUAACC flip_guide_30_11UCUUGAGCUGUCUCA (SEQ ID NO: 594) NF2_30bp_T- NF2 ACG 28 T21MGACUUCUUGGGUUGCU flip_guide_30_3 UCCUCUUGAGAGAGC (SEQ ID NO: 595)NF2_30bp_T- NF2 ACG 26 T21M GGAACUUCUUGGGUUG flip_guide_30_5CUUCCUCUUGAGAGA (SEQ ID NO: 596) NF2_30bp_T- NF2 ACG 24 T21MGGUGAACUUCUUGGGU flip_guide_30_7 UGCUUCCUCUUGAGA (SEQ ID NO: 597)NF2_30bp_T- NF2 ACG 22 T21M GCGGUGAACUUCUUGG flip_guide_30_9GUUGCUUCCUCUUGA (SEQ ID NO: 598) NF2_30bp_T- NF2 ACG 20 T21MGCACGGUGAACUUCUU flip_guide_30_11 GGGUUGCUUCCUCUU (SEQ ID NO: 599)RAF1_30bp_T- RAF1 UCC 28 P30S GAGUAGAGAUGCAGCU flip_guide_30_3GGAGCCAUCAAACAC (SEQ ID NO: 600) RAF1_30bp_T- RAF1 UCC 26 P30SGGUAGUAGAGAUGCAG flip_guide_30_5 CUGGAGCCAUCAAAC (SEQ ID NO: 601)RAF1_30bp_T- RAF1 UCC 24 P30S GUUGUAGUAGAGAUGC flip_guide_30_7AGCUGGAGCCAUCAA (SEQ ID NO: 602) RAF1_30bp_T- RAF1 UCC 22 P30SGUAUUGUAGUAGAGAU flip_guide_30_9 GCAGCUGGAGCCAUC (SEQ ID NO: 603)RAF1_30bp_T- RAF1 UCC 20 P30S GACUAUUGUAGUAGAG flip_guide_30_11AUGCAGCUGGAGCCA (SEQ ID NO: 604)

TABLE 23 Guide sequences used for synthetic target editing TargetedBase flip/ Codon Name gene Motif position change Spacer sequenceNM_000016.5_C- ACADM ACA C/7 H67Y GUAUCAUCUUCUGCAGCC flip_guideACUGGGAUGAUUU (SEQ ID NO: 605) NM_000018.3_C- ACADVL GCG C/9 A283VGUCUCCACCCCAAAAGCU flip_guide GUGAUCUUCUCCU (SEQ ID NO: 606)NM_000071.2_C- CBS GCG C/9 R109C GGAACUCACCCUUGGCCA flip_guideAGAGCUCACACUU (SEQ ID NO: 607) NM_000138.4_C- FBN1 GCG C/5 R1408CGGAGCCCUCAUCAAGGUC flip_guide UGUACAAGUGAAG (SEQ ID NO: 608)NM_000141.4_C- FGFR2 CCC C/7 P267S GCUGUGGCGGCAUUUGCC flip_guideGGCAGUCCGGCUU (SEQ ID NO: 609) NM_000152.4_C- GAA CCC C/7 P552LGGCCUGGCGGGUCCCCCC flip_guide AACCACCCCAGGC (SEQ ID NO: 610)NM_000341.3_C- SLC3A1 ACG C/7 T467M GAGAAGCCUGUUCAUCAC flip_guideGUUGACAUACUGA (SEQ ID NO: 611) NM_000375.2_C- UROS ACG C/9 R73CGCUCCAAACCUAACUCUG flip_guide CUGCUUCCACUGC (SEQ ID NO: 612)NM_000431.3_C- MVK ACA C/9 T268I GUGGCAUCUCUUGAGGUC flip_guideAGGAGGGGGGCCA (SEQ ID NO: 613) NM_000551.3_C- VHL CCG C/7 P158LGUCUUUCCGAGUAUACAC flip_guide UGGCAGUGUGAUA (SEQ ID NO: 614)NM_001256850.1_C- TTN ACG C/9 R30071C GCUUUCCACCUGGGCCAG flip_guideGGGAAUCAAGCAC (SEQ ID NO: 615) NM_002397.4_C- MEF2C ACG C/9 T1MGUUCUCCCCCUAGUCCCC flip_guide GUUUUUCUUCUCU (SEQ ID NO: 616)NM_002474.2_C- MYH11 CCG C/9 P1264L GUGGACUGCCGCUCCUGC flip_guideACCUGCGCCUCCA (SEQ ID NO: 617) NM_002834.4_C- PTPN11 CCU C/9 L285FGAUGAUCAACGGGCAGGA flip_guide UGUUUUUAUAUCU (SEQ ID NO: 618)NM_004004.5_C- GJB2 ACG C/5 R77W GGCCCCUAGCCGGAUGUG flip_guideGGAGAUGGGGAAG (SEQ ID NO: 619) NM_004572.3_C- PKP2 CCG C/9 R796CGUGUGUAACCGGCAGAGG flip_guide CUGUAGUUUCAAU (SEQ ID NO: 620)NM_005609.3_C- PYGM GCG C/9 R798W GCCGCGUCCCCUCUCUUG flip_guideGGUUCUUGUACAA (SEQ ID NO: 621) NM_005633.3_C- SOS1 ACG C/9 T269MGCAUCUGUCCUUUCUACU flip_guide GUAUCUUCUAUAU (SEQ ID NO: 622)NM_014139.2_C- SCN11A CCG C/9 P396L GCAACAGCCCGGGUUAAG flip_guideUUAAUCAGGUAGA (SEQ ID NO: 623) NM_014874.3_C- MFN2 CCG C/9 P76LGCAACAGCCCGGGUUAAG flip_guide UUAAUCAGGUAGA (SEQ 
ID NO: 624)NM_015559.2_C- SETBP1 ACU C/7 T871I GGUCCCACUGCCGCUGUC flip_guideGCUGGGGAUCGUC (SEQ ID NO: 625) NM_020630.4_C- RET CCG C/5 R620CGUCGCCGAAGCACUUCUC flip_guide CUCCUCAGGGAAG (SEQ ID NO: 626)NM_000016.5_F_30bp_C- ACADM ACA C/5 H67Y GUCAUCUUCUGCAGCCACflip_guide_30_5 UGGGAUGAUUUCC (SEQ ID NO: 627) NM_000016.5_F_30bp_C-ACADM ACA C/7 H67Y GUAUCAUCUUCUGCAGCC flip_guide_30_7 ACUGGGAUGAUUU(SEQ ID NO: 628) NM_000016.5_F_30bp_C- ACADM ACA C/9 H67YGUUUAUCAUCUUCUGCAG flip_guide_30_9 CCACUGGGAUGAU (SEQ ID NO: 629)NM_000018.3_F_30bp_C- ACADVL GCG C/5 A283V GCACCCCAAAAGCUGUGAflip_guide_30_5 UCUUCUCCUUCAC (SEQ ID NO: 630) NM_000018.3_F_30bp_C-ACADVL GCG C/7 A283V GUCCACCCCAAAAGCUGU flip_guide_30_7 GAUCUUCUCCUUC(SEQ ID NO: 631) NM_000018.3_F_30bp_C- ACADVL GCG C/9 A283VGUCUCCACCCCAAAAGCU flip_guide_30_9 GUGAUCUUCUCCU (SEQ ID NO: 632)NM_000071.2_F_30bp_C- CBS GCG C/5 R109C GUCACCCUUGGCCAAGAGflip_guide_30_5 CUCACACUUCAGG (SEQ ID NO: 633) NM_000071.2_F_30bp_C- CBSGCG C/7 R109C GACUCACCCUUGGCCAAG flip_guide_30_7 AGCUCACACUUCA(SEQ ID NO: 634) NM_000071.2_F_30bp_C- CBS GCG C/9 R109CGGAACUCACCCUUGGCCA flip_guide_30_9 AGAGCUCACACUU (SEQ ID NO: 635)NM_000138.4_F_30bp_C- FBN1 GCG C/5 R1408C GGAGCCCUCAUCAAGGUCflip_guide_30_5 UGUACAAGUGAAG (SEQ ID NO: 636) NM_000138.4_F_30bp_C-FBN1 GCG C/7 R1408C GCAGAGCCCUCAUCAAGG flip_guide_30_7 uCUGUACAAGUGA(SEQ ID NO: 637) NM_000138.4_F_30bp_C- FBN1 GCG C/9 R1408CGCUCAGAGCCCUCAUCAA flip_guide_30_9 GGUCUGUACAAGU (SEQ ID NO: 638)NM_000141.4_F_30bp_C- FGFR2 CCC C/5 P267S GGUGGCGGCAUUUGCCGGflip_guide_30_5 CAGUCCGGCUUGG (SEQ ID NO: 639) NM_000141.4_F_30bp_C-FGFR2 CCC C/7 P267S GCUGUGGCGGCAUUUGCC flip_guide_30_7 GGCAGUCCGGCUU(SEQ ID NO: 640) NM_000141.4_F_30bp_C- FGFR2 CCC C/9 P267SGCACUGUGGCGGCAUUUG flip_guide_30_9 CCGGCAGUCCGGC (SEQ ID NO: 641)NM_000152.4_F_30bp_C- GAA CCC C/5 P552L GCUGGCGGGUCCCCCCAAflip_guide_30_5 CCACCCCAGGCAC (SEQ ID NO: 642) NM_000152.4_F_30bp_C- GAACCC C/7 P552L GGCCUGGCGGGUCCCCCC flip_guide_30_7 AACCACCCCAGGC(SEQ ID NO: 643) 
NM_000152.4_F_30bp_C- GAA CCC C/9 P552LGCCGCCUGGCGGGUCCCC flip_guide_30_9 CCAACCACCCCAG (SEQ ID NO: 644)NM_000341.3_F_30bp_C- SLC3A1 ACG C/5 T467M GAAGCCUGUUCAUCACGUflip_guide_30_5 UGACAUACUGAUU (SEQ ID NO: 645) NM_000341.3_F_30bp_C-SLC3A1 ACG C/7 T467M GAGAAGCCUGUUCAUCAC flip_guide_30_7 GUUGACAUACUGA(SEQ ID NO: 646) NM_000341.3_F_30bp_C- SLC3A1 ACG C/9 T467MGAAAGAAGCCUGUUCAUC flip_guide_30_9 ACGUUGACAUACU (SEQ ID NO: 647)NM_000375.2_F_30bp_C- UROS ACG C/5 R73C GAAACCUAACUCUGCUGCflip_guide_30_5 UUCCACUGCUCUG (SEQ ID NO: 648) NM_000375.2_F_30bp_C-UROS ACG C/7 R73C GCCAAACCUAACUCUGCU flip_guide_30_7 GCUUCCACUGCUC(SEQ ID NO: 649) NM_000375.2_F_30bp_C- UROS ACG C/9 R73CGCUCCAAACCUAACUCUG flip_guide_30_9 CUGCUUCCACUGC (SEQ ID NO: 650)NM_000431.3_F_30bp_C- MVK ACA C/5 T268I GAUCUCUUGAGGUCAGGAflip_guide_30_5 GGGGGGCCACGAU (SEQ ID NO: 651) NM_000431.3_F_30bp_C- MVKACA C/7 T268I GGCAUCUCUUGAGGUCAG flip_guide_30_7 GAGGGGGGCCACG(SEQ ID NO: 652) NM_000431.3_F_30bp_C- MVK ACA C/9 T268IGUGGCAUCUCUUGAGGUC flip_guide_30_9 AGGAGGGGGGCCA (SEQ ID NO: 653)NM_000551.3_F_30bp_C- VHL CCG C/5 P158L GUUUCCGAGUAUACACUGflip_guide_30_5 GCAGUGUGAUAUU (SEQ ID NO: 654) NM_000551.3_F_30bp_C- VHLCCG C/7 P158L GUCUUUCCGAGUAUACAC flip_guide_30_7 UGGCAGUGUGAUA(SEQ ID NO: 655) NM_000551.3_F_30bp_C- VHL CCG C/9 P158LGGCUCUUUCCGAGUAUAC flip_guide_30_9 ACUGGCAGUGUGA (SEQ ID NO: 656)NM_001256850.1_F_30bp_C- TTN ACG C/5 R30071C GCCACCUGGGCCAGGGGAflip_guide_30_5 AUCAAGCACUUUG (SEQ ID NO: 657) NM_001256850.1_F_30bp_C-TTN ACG C/7 R30071C GUUCCACCUGGGCCAGGG flip_guide_30_7 GAAUCAAGCACUU(SEQ ID NO: 658) NM_001256850.1_F_30bp_C TTN ACG C/9 R30071CGCUUUCCACCUGGGCCAG flip_guide_30_9 GGGAAUCAAGCAC (SEQ ID NO: 659)NM_002397.4_F_30bp_C- MEF2C ACG C/5 T1M GCCCCCUAGUCCCCGUUUflip_guide_30_5 UUCUUCUCUCUCU (SEQ ID NO: 660) NM_002397.4_F_30bp_C-MEF2C ACG C/7 T1M GCUCCCCCUAGUCCCCGU flip_guide_30_7 UUUUCUUCUCUCU(SEQ ID NO: 661) NM_002397.4_F_30bp_C- MEF2C ACG C/9 T1MGUUCUCCCCCUAGUCCCC flip_guide_30_9 GUUUUUCUUCUCU (SEQ ID 
NO: 662)NM_002474.2_F_30bp_C- MYH11 CCG C/5 P1264L GCUGCCGCUCCUGCACCUflip_guide_30_5 GCGCCUCCAGCUU (SEQ ID NO: 663) NM_002474.2_F_30bp_C-MYH11 CCG C/7 P1264L GGACUGCCGCUCCUGCAC flip_guide_30_7 CUGCGCCUCCAGC(SEQ ID NO: 664) NM_002474.2_F_30bp_C- MYH11 CCG C/9 P1264LGUGGACUGCCGCUCCUGC flip_guide_30_9 ACCUGCGCCUCCA (SEQ ID NO: 665)NM_002834.4_F_30bp_C- PTPN11 CCU C/5 L285F GUCAACGGGCAGGAUGUUflip_guide_30_5 UUUAUAUCUAUUU (SEQ ID NO: 666) NM_002834.4_F_30bp_C-PTPN11 CCU C/7 L285F GGAUCAACGGGCAGGAUG flip_guide_30_7 UUUUUAUAUCUAU(SEQ ID NO: 667) NM_002834.4_F_30bp_C- PTPN11 CCU C/9 L285FGAUGAUCAACGGGCAGGA flip_guide_30_9 UGUUUUUAUAUCU (SEQ ID NO: 668)NM_004004.5_F_30bp_C- GJB2 ACG C/5 R77W GGCCCCUAGCCGGAUGUGflip_guide_30_5 GGAGAUGGGGAAG (SEQ ID NO: 669) NM_004004.5_F_30bp_C-GJB2 ACG C/7 R77W GGGGCCCCUAGCCGGAUG flip_guide_30_7 UGGGAGAUGGGGA(SEQ ID NO: 670) NM_004004.5_F_30bp_C- GJB2 ACG C/9 R77WGCAGGGCCCCUAGCCGGA flip_guide_30_9 UGUGGGAGAUGGG (SEQ ID NO: 671)NM_004572.3_F_30bp_C- PKP2 CCG C/5 R796C GUAACCGGCAGAGGCUGUflip_guide_30_5 AGUUUCAAUGAGA (SEQ ID NO: 672) NM_004572.3_F_30bp_C-PKP2 CCG C/7 R796C GUGUAACCGGCAGAGGCU flip_guide_30_7 GUAGUUUCAAUGA(SEQ ID NO: 673) NM_004572.3_F_30bp_C- PKP2 CCG C/9 R796CGUGUGUAACCGGCAGAGG flip_guide_30_9 CUGUAGUUUCAAU (SEQ ID NO: 674)NM_005609.3_F_30bp_C- PYGM GCG C/5 R798W GGUCCCCUCUCUUGGGUUflip_guide_30_5 CUUGUACAAGGCG (SEQ ID NO: 675) NM_005609.3_F_30bp_C-PYGM GCG C/7 R798W GGCGUCCCCUCUCUUGGG flip_guide_30_7 UUCUUGUACAAGG(SEQ ID NO: 676) NM_005609.3_F_30bp_C- PYGM GCG C/9 R798WGCCGCGUCCCCUCUCUUG flip_guide_30_9 GGUUCUUGUACAA (SEQ ID NO: 677)NM_005633.3_F_30bp_C- SOS1 ACG C/5 T269M GUGUCCUUUCUACUGUAUflip_guide_30_5 CUUCUAUAUGGCC (SEQ ID NO: 678) NM_005633.3_F_30bp_C-SOS1 ACG C/7 T269M GUCUGUCCUUUCUACUGU flip_guide_30_7 AUCUUCUAUAUGG(SEQ ID NO: 679) NM_005633.3_F_30bp_C- SOS1 ACG C/9 T269MGCAUCUGUCCUUUCUACU flip_guide_30_9 GUAUCUUCUAUAU (SEQ ID NO: 680)NM_014139.2_F_30bp_C- SCN11A CCG C/5 P396L GAGCCCGGGUUAAGUUAAflip_guide_30_5 
UCAGGUAGAAGGA (SEQ ID NO: 681) NM_014139.2_F_30bp_C-SCN11A CCG C/7 P396L GACAGCCCGGGUUAAGUU flip_guide_30_7 AAUCAGGUAGAAG(SEQ ID NO: 682) NM_014139.2_F_30bp_C- SCN11A CCG C/9 P396LGCAACAGCCCGGGUUAAG flip_guide_30_9 UUAAUCAGGUAGA (SEQ ID NO: 683)NM_014874.3_F_30bp_C- MFN2 CCG C/5 P76L GGUCCCGAACCUGUUCUUflip_guide_30_5 CUGUGGUAACGGG (SEQ ID NO: 684) NM_014874.3_F_30bp_C-MFN2 CCG C/7 P76L GACGUCCCGAACCUGUUC flip_guide_30_7 UUCUGUGGUAACG(SEQ ID NO: 685) NM_014874.3_F_30bp_C- MFN2 CCG C/9 P76LGUGACGUCCCGAACCUGU flip_guide_30_9 UCUUCUGUGGUAA (SEQ ID NO: 686)NM_015559.2_F_30bp_C- SETBP1 ACU C/5 T871I GCCCACUGCCGCUGUCGCflip_guide_30_5 UGGGGAUCGUCUC (SEQ ID NO: 687) NM_015559.2_F_30bp_C-SETBP1 ACU C/7 T871I GGUCCCACUGCCGCUGUC flip_guide_30_7 GCUGGGGAUCGUC(SEQ ID NO: 688) NM_015559.2_F_30bp_C- SETBP1 ACU C/9 T871IGCUGUCCCACUGCCGCUG flip_guide_30_9 UCGCUGGGGAUCG (SEQ ID NO: 689)NM_020630.4_F_30bp_C- RET CCG C/5 R620C GUCGCCGAAGCACUUCUCflip_guide_30_5 CUCCUCAGGGAAG (SEQ ID NO: 690) NM_020630.4_F_30bp_C- RETCCG C/7 R620C GGCUCGCCGAAGCACUUC flip_guide_30_7 UCCUCCUCAGGGA(SEQ ID NO: 691) NM_020630.4_F_30bp_C- RET CCG C/9 R620CGGGGCUCGCCGAAGCACU flip_guide_30_9 UCUCCUCCUCAGG (SEQ ID NO: 692) ApoE4APOE GCG C/30 C130R Gccacguccuccaugucc rs429358 C gcgcccagccggg flip 30(SEQ ID NO: 693) ApoE4 APOE GCG C/28 C130R Ggcccacguccuccaugu rs429358 Cccgcgcccagccg flip 28 (SEQ ID NO: 694) ApoE4 APOE GCG C/26 C130RGccgcccacguccuccau rs429358 C guccgcgcccagc flip 26 (SEQ ID NO: 695)ApoE4 APOE GCG C/24 C130R Gggccgcccacguccucc rs429358 C auguccgcgcccaflip 24 (SEQ ID NO: 696) ApoE4 APOE GCG C/22 C130R Ggcggccgcccacguccurs429358 C ccauguccgcgcc flip 22 (SEQ ID NO: 697) ApoE4 APOE GCG C/20C130R Gaggcggccgcccacguc rs429358 C cuccauguccgcg flip 20(SEQ ID NO: 698) ApoE4 APOE GCG C/18 C130R Gccaggcggccgcccacg rs429358 Cuccuccauguccg flip 18 (SEQ ID NO: 699) ApoE4 APOE GCG C/16 C130RGcaccaggcggccgccca rs429358 C cguccuccauguc flip 16 (SEQ ID NO: 700)ApoE4 rs7412 APOE GCG C/30 C176R Gccuucugcaggucaucg C 
flip 30gcaucgcggagga (SEQ ID NO: 701) ApoE4 rs7412 APOE GCG C/28 C176RGgcccuucugcaggucau C flip 28 cggcaucgcggag (SEQ ID NO: 702) ApoE4 rs7412APOE GCG C/26 C176R Gaggcccuucugcagguc C flip 26 aucggcaucgcgg(SEQ ID NO: 703) ApoE4 rs7412 APOE GCG C/24 C176R GccaggcccuucugcaggC flip 24 ucaucggcaucgc (SEQ ID NO: 704) ApoE4 rs7412 APOE GCG C/22C176R Gugccaggcccuucugca C flip 22 ggucaucggcauc (SEQ ID NO: 705)ApoE4 rs7412 APOE GCG C/20 C176R Gacugccaggcccuucug C flip 20caggucaucggca (SEQ ID NO: 706) ApoE4 rs7412 APOE GCG C/18 C176RGacacugccaggcccuuc C flip 18 ugcaggucaucgg (SEQ ID NO: 707) ApoE4 rs7412APOE GCG C/16 C176R Gguacacugccaggcccu C flip 16 ucugcaggucauc(SEQ ID NO: 708)

TABLE 24 Mammalian plasmids and maps
Plasmid | Description | Benchling link
pC0043 | PspCas13b crRNA backbone | benchling.com/s/seq-OH6nMmZCZn930BWqcFNa
pC0076 | CMV-dCas13b6-mapkNES-GS-dADAR2 E488Q | benchling.com/s/seq-BulRvsrtwP4aEJtTqYM2
pC0077 | pCMV-dCas13b6-mapkNES-GS-dADAR2(E488Q/V351G/S486A/T375S/S370C/P462A/N597I/L332I/I398V) RESCUEv8 | benchling.com/s/seq-gQ13PMPLkcO6OceAfmpC
pC0078 | pCMV-dCas13b6-mapkNES-GS-dADAR2(E488Q/V351G/S486A/T375S/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418E/S661T) V16 | benchling.com/s/seq-19Ytwwh0i0vSIbyXYZ95
pC0079 | pCMV-dCas13b6-mapkNES-GS-dADAR2(E488Q/V351G/S486A/T375A/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418E/S661T) V16S | benchling.com/s/seq-WX6VnavLS6JaaZ54XAOx
pC0080 | pCMV-dCas13b12-HIVNES-GS-dADAR2(E488Q/V351G/S486A/T375S/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418E/S661T) V16 | benchling.com/s/seq-GQqPCRE916KnEfHksQem
pC0081 | pCMV-dCas13b12-HIVNES-GS-dADAR2(E488Q/V351G/S486A/T375S/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418E/S661T/S375A) V16S | benchling.com/s/seq-qjbEAXZgupeRXBa8ablS
pC0082 | CMV-Cluciferase-polyA EF1a-G-luciferase(C82R)-polyA C to U reporter TCG motif | benchling.com/s/seq-Qjsg3Yx0r1Hs77GT58BI
pC0083 | CMV-Cluciferase-polyA EF1a-G-luciferase(C82R)-polyA C to U reporter GCG motif | benchling.com/s/seq-Z8zwu3LdetcuYHAFGnpe
pC0084 | CMV-Cluciferase-polyA EF1a-G-luciferase(C82R)-polyA C to U reporter ACG motif | benchling.com/s/seq-G2Iag6I8NBQAXqbJnou5
pC0085 | CMV-Cluciferase-polyA EF1a-G-luciferase(C82R)-polyA C to U reporter CCG motif | benchling.com/s/seq-alkwhNUsFTg80TVmpquP
pC0086 | CMV-Cluciferase-polyA EF1a-G-luciferase(L77P)-polyA C to U reporter CCA motif | benchling.com/s/seq-1J8Fm6vtF7GS676Q7pwS
pC0087 | CMV-Cluciferase-polyA EF1a-G-luciferase(L77P)-polyA C to U reporter CCT motif | benchling.com/s/seq-5MMokwvxoAjq6ML2sjjZ
pC0088 | pCMV-ADAR2dd(E488Q/V351G/S486A/T375S/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418E/S661T) V16 | benchling.com/s/seq-YISAybq2YnuclVwYDy95
pC0089 | pCMV-ADAR2 full length (E488Q/V351G/S486A/T375S/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418E/S661T) V16 | benchling.com/s/seq-95ZpoHj9GhQFzIu3m6cb
pC0090 | Beta catenin reporter M50 Super 8x (TCF/LEF binding sites) TOPFlash with Gluc/Cluc | benchling.com/s/seq-jPxZnxs3wSeKZhgTTDBu
pC0091 | Beta catenin reporter control M51 Super 8x (mutated TCF/LEF binding sites) FOPFlash with Gluc/Cluc | benchling.com/s/seq-130b6c9baCfw8R3lTgSR

TABLE 25 Yeast plasmids and maps
Plasmid | Description | Benchling link
pC0092 | pGAL-dCas13b6-GS-dADAR2 [RESCUE v0 Yeast] | benchling.com/s/seq-w1l2aOHR2gSe4P2aQ7VY
pC0093 | pGAL-dCas13b6-GS-dADAR2(V351/S486A/T375S) [RESCUE v3 Yeast] | benchling.com/s/seq-saQngvNf6i3GhSGF0H3I
pC0094 | pGAL-dCas13b6-GS-dADAR2(V351G/S486A/T375S/S370C/P462A/L332I) [RESCUE v7 Yeast] | benchling.com/s/seq-GIJ7BnpV3Vd3XtKiIxdm
pC0095 | pGAL-dCas13b6-GS-dADAR2(V351/S486A/T375S/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418/ES661T) [RESCUE v16 Yeast] | benchling.com/s/seq-vRnAMIwozJk5r6LmOCgG
pC0096 | pYES3/CT pADH1-HH-Targeting-B6_DR-HDV--space-ADH1_terminator His (P196L) [Yeast target His P196L] | benchling.com/s/seq-Xs2ffVMn4FwwQ79zDDEo
pC0097 | pYES3/CT pADH1-HH-Golden-gate-BsmBi-BsmbI_DR--HDV-space-ADH1_terminator His (P196L) [Yeast target His P196L NT] | benchling.com/s/seq-UM9NjG7JKK0GFe9MowGo
pC0098 | pYES3/CT pADH1-HH-Guide-B6_DR--HDV-space-ADH1_terminator His S129P [Yeast target His S129P] | benchling.com/s/seq-EefJI5brqll3fm0B5Qc5
pC0099 | pYES3/CT pADH1-HH-Golden-gate-BsmBi-BsmbI_DR--HDV-space-ADH1_terminator His Motifs S129P [Yeast target His S129P NT] | benchling.com/s/seq-bt7gOlrp8OuOoV3YJWZG
pC0100 | pYES3/CT pADH1-HH-Y66H-targeting-B6-DR-HDV-ADH1-term ATG-yeGFP Y66H [Yeast target GFP Y66H] | benchling.com/s/seq-HiMELqTYPT9y0nOAKEq2
pC0101 | pYES3/CT pADH1-HH-Golden-gate-BsmBi-BsmbI_-B6_DR-HDV-ADH1-term ATG-yeGFP Y66H Reporter [Yeast target GFP Y66H NT] | benchling.com/s/seq-OCWlvnjeKYwSbG8GELTQ

TABLE 26 Guide sequences used for yeast targeting
Name | Targeted gene | Motif | Base flip/spacer length/position | Codon change | Spacer sequence
His L196P targeting | HIS | CCU | U/50/34 | L196P | Ucuuauggcaaccgcaugagccuugaacgcacucucacuacggugaugau (SEQ ID NO: 709)
His S129P targeting | HIS | UCC | C/30/26 | S129P | Gcuugcaagugccucauccaaaggcgcaaau (SEQ ID NO: 710)
His Y66H targeting | EGFP | UCA | U/50/34 | Y66H | Aaacauugaacaccauuaguuaaaguagugacuaagguuggccauggaac (SEQ ID NO: 711)

Mammalian Cell Culture

Unless otherwise stated, mammalian cell culture experiments were performed in the HEK293FT line (American Type Culture Collection (ATCC)), grown in Dulbecco's Modified Eagle Medium containing glucose, sodium pyruvate, and GlutaMAX (Thermo Fisher Scientific), and supplemented with 1× penicillin-streptomycin (Thermo Fisher Scientific) and 10% fetal bovine serum (VWR Seradigm). Cells were maintained at confluency below 80%.

Unless otherwise noted, all transfections were performed with Lipofectamine 2000 (Thermo Fisher Scientific) in 96-well plates coated with poly-D-lysine (BD Biocoat). Cells were plated at approximately 20,000 cells/well 16 hours prior to transfection to ensure 90% confluency at the time of transfection. For each well on the plate, transfection plasmids were combined with Opti-MEM I Reduced Serum Medium (Thermo Fisher Scientific) to a total of 25 μl. Separately, 24.5 μl of Opti-MEM was combined with 0.5 μl of Lipofectamine 2000. Plasmid and Lipofectamine solutions were then combined and incubated for 5 minutes, after which they were pipetted onto cells.
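By way of illustration only, the per-well Lipofectamine volumes stated above may be scaled to a multi-well master mix as sketched below. The function name master_mix and the 10% pipetting overage are assumptions introduced for this sketch and are not taken from the protocol.

```python
# Per-well volumes stated in the protocol (microlitres).
PLASMID_MIX_UL = 25.0        # plasmids brought to 25 ul with Opti-MEM
OPTIMEM_FOR_LIPO_UL = 24.5   # Opti-MEM combined with Lipofectamine 2000
LIPOFECTAMINE_UL = 0.5       # Lipofectamine 2000 per well


def master_mix(n_wells: int, overage: float = 0.10) -> dict:
    """Scale the per-well Lipofectamine mix to n_wells.

    `overage` (extra fraction to cover pipetting loss) is an assumption,
    not a value stated in the protocol.
    """
    if n_wells <= 0:
        raise ValueError("n_wells must be positive")
    scale = n_wells * (1.0 + overage)
    return {
        "opti_mem_ul": round(OPTIMEM_FOR_LIPO_UL * scale, 2),
        "lipofectamine_ul": round(LIPOFECTAMINE_UL * scale, 2),
        # Volume added to each well once both solutions are combined.
        "combined_volume_per_well_ul": PLASMID_MIX_UL
        + OPTIMEM_FOR_LIPO_UL
        + LIPOFECTAMINE_UL,
    }


if __name__ == "__main__":
    print(master_mix(96))
```

The plasmid solution is left out of the master mix because its composition differs from well to well; only the shared Lipofectamine solution is scaled.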

RESCUE Editing in Mammalian Cells

To assess RESCUE activity in mammalian cells, Applicants transfected 150 ng of RESCUE vector, 300 ng of guide expression plasmid, and, when using a reporter (either luciferase, STAT activity, or Beta Catenin activity), 40 ng of the RNA editing reporter. After 48 hours, RNA from cells was harvested and reverse transcribed using a method previously described (33) with a gene specific reverse transcription primer. The extracted cDNA was then subjected to two rounds of PCR to add Illumina adaptors and sample barcodes using NEBNext High-Fidelity 2× PCR Master Mix (New England Biolabs). The library was then subjected to next generation sequencing on an Illumina NextSeq or MiSeq. RNA editing rates were then evaluated at all adenosines within the sequencing window.
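By way of illustration only, a per-position editing rate may be computed from amplicon reads as the fraction of reads carrying the edited base at each reference position of interest. The sketch below assumes reads that are already aligned to the reference without gaps and start at the same position; the function name editing_rates and this simplified input format are assumptions for the sketch, not the pipeline used by Applicants. An A to I edit is read as G after reverse transcription, and a C to U edit is read as T.

```python
from collections import Counter


def editing_rates(
    reference: str,
    reads: list[str],
    ref_base: str = "A",
    edited_base: str = "G",
) -> dict[int, float]:
    """Return {position: fraction of reads showing edited_base}.

    Only positions where the reference carries ref_base are reported.
    Reads are assumed to be ungapped and aligned at offset 0 (assumption).
    """
    reference = reference.upper()
    rates: dict[int, float] = {}
    for pos, base in enumerate(reference):
        if base != ref_base:
            continue
        # Count unambiguous base calls covering this position.
        calls = Counter(
            read[pos].upper()
            for read in reads
            if pos < len(read) and read[pos].upper() in "ACGT"
        )
        depth = sum(calls.values())
        if depth:
            rates[pos] = calls[edited_base] / depth
    return rates


if __name__ == "__main__":
    ref = "ACAG"
    reads = ["ACAG", "GCAG", "ACGG", "ACAG"]
    print(editing_rates(ref, reads))            # A-to-G calls per adenosine
    print(editing_rates(ref, reads, "C", "T"))  # C-to-T calls per cytidine
```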

In experiments where the luciferase reporter was targeted for RNA editing, Applicants also harvested the media with secreted luciferase prior to RNA harvest. Applicants measured luciferase activity with Cypridina and Gaussia luciferase assay kits (Targeting Systems) on a plate reader (Biotek Synergy Neo2) with an injection protocol. All replicates performed are biological replicates.

In experiments where the input amount of RESCUE plasmid was varied, total plasmid amount was kept constant by replacing RESCUE expression plasmid with a filler plasmid expressing a CMV-driven mScarlet, except where noted. In the experiment where input amount of guide plasmid was varied, total plasmid amount was either kept constant (“with filler plasmid”) via substitution of non-targeting guide, or not kept constant (“without filler plasmid”); in this experiment, there was no filler plasmid for the RESCUE plasmid.
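By way of illustration only, the filler plasmid amount that keeps the total constant in a RESCUE plasmid titration may be computed as sketched below. The 150 ng default is the RESCUE vector amount stated for the standard transfection; the function name filler_amount_ng and the example titration points are assumptions for this sketch.

```python
def filler_amount_ng(rescue_ng: float, full_rescue_ng: float = 150.0) -> float:
    """Filler plasmid (ng) so that RESCUE + filler equals full_rescue_ng."""
    if not 0.0 <= rescue_ng <= full_rescue_ng:
        raise ValueError("rescue_ng must lie between 0 and full_rescue_ng")
    return full_rescue_ng - rescue_ng


if __name__ == "__main__":
    # Hypothetical titration points, for illustration only.
    for rescue in (150.0, 75.0, 37.5, 0.0):
        print(rescue, filler_amount_ng(rescue))
```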

Biochemical Characterization of RESCUE Mutations on ADAR2dd

To assess the kinetic activity of hADAR2 deaminase domain containing RESCUE mutations, multiple iterations were cloned into a pGAL-His6-TwinStrep-SUMO-hADAR2dd backbone containing the URA3 gene. The plasmids were transformed into BCY123 competent yeast cells. Briefly, frozen cells were thawed in a 37° C. water bath for 15-30 seconds. 10 μL of cells per condition were centrifuged at 13,000 g in a microcentrifuge for 2 minutes and the supernatant was removed. The prepared transformation mix for each construct contained 260 μL of PEG 3350 prepared at 50% w/v, 50 μL of denatured salmon sperm DNA (Thermo Fisher Scientific), 36 μL of 1 M lithium acetate, and 750 ng of plasmid in 14 μL of DI H2O. The yeast pellet was resuspended in the transformation mix and incubated in a 42° C. water bath for 30 minutes before centrifugation at 13,000 g for 30 seconds and subsequent supernatant removal. The pellet was then resuspended in 1 mL of DI H2O, and 50 μL was taken into 1 mL of DI H2O for mixing. Subsequently, 200 μL was plated onto minimal glucose plates minus uracil for prototrophic selection.
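By way of illustration only, the final concentrations in the transformation mix described above, and the fraction of the resuspended cells that reaches the plate, follow from simple volume arithmetic. The sketch below assumes that volumes are additive and that the 50 μL aliquot is diluted to a final volume of 1.05 mL; both are assumptions, and the variable names are introduced for this sketch only.

```python
# Component volumes of the transformation mix stated above (microlitres).
MIX_UL = {
    "PEG 3350, 50% w/v": 260.0,
    "salmon sperm DNA": 50.0,
    "1 M lithium acetate": 36.0,
    "plasmid in DI water": 14.0,
}

total_ul = sum(MIX_UL.values())

# Final concentrations after mixing (volumes assumed additive).
peg_final_percent = 50.0 * MIX_UL["PEG 3350, 50% w/v"] / total_ul
liac_final_molar = 1.0 * MIX_UL["1 M lithium acetate"] / total_ul

# Plating: 50 ul out of the 1 mL resuspension is diluted into 1 mL of water
# (final 1050 ul, assumption), and 200 ul of that dilution is plated.
fraction_plated = (50.0 / 1000.0) * (200.0 / 1050.0)

if __name__ == "__main__":
    print(f"total mix volume: {total_ul:.0f} ul")
    print(f"PEG 3350 final: {peg_final_percent:.1f}% w/v")
    print(f"lithium acetate final: {liac_final_molar:.2f} M")
    print(f"fraction of cells plated: {fraction_plated:.4f}")
```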

Plates were incubated at 30° C. for 48 hr before seeding single colonies into 10 mL cultures of yeast minimal media supplemented with dextrose. This included yeast dropout supplement Y2001 (1.39 g/L), yeast nitrogen base without amino acids (6.7 g/L), adenine hemisulfate (0.022 g/L), histidine (0.076 g/L), leucine (0.38 g/L), tryptophan (0.076 g/L), and dextrose (20 g/L). Cultures were grown overnight before seeding the entire 10 mL into a 100 mL minimal media/dextrose culture. Following 8 hours of growth, each construct was seeded into two 2 L flasks containing 1 L of minimal media supplemented with 20 g of raffinose (VWR). These were grown overnight and induced with 100 mL of 30% galactose for eight hours before harvesting. Cultures were spun down in a Beckman Coulter Avanti J-E centrifuge at 6,000 r.p.m. for 20 minutes with pellets stored at −80° C.

Purification Methods

Whole-Transcriptome Sequencing to Evaluate ADAR Editing Specificity

For analyzing off-target RNA editing sites across the transcriptome, Applicants harvested total RNA from cells 48 hours post-transfection using the RNeasy Plus Miniprep kit (Qiagen). The mRNA fraction was then enriched using a NEBNext Poly(A) mRNA Magnetic Isolation Module (NEB) and this RNA was then prepared for sequencing using an NEBNext Ultra RNA Library Prep Kit for Illumina (NEB). The libraries were then sequenced on an Illumina NextSeq and loaded such that there were at least 5 million reads per sample.

RNA Editing Analysis for Targeted and Transcriptome-Wide Experiments

Analysis of the transcriptome-wide editing RNA sequencing data was performed on the FireCloud computational framework (software/broadinstitute.org/firecloud/) using a custom workflow Applicants developed: portal.firecloud.org/methods/m/rna_editing_final_workflow/rna_editing_final_workflow/l. For analysis, unless otherwise denoted, sequence files were randomly downsampled to 5 million reads. An index was generated using the RefSeq GRCh38 assembly with Gluc and Cluc sequences added, and reads were aligned and quantified using Bowtie/RSEM version 1.3.0. Alignment BAMs were then sorted and analyzed for RNA editing sites using REDitools (35, 36) with the following parameters: -t 8 -e -d -l -U [AG or TC or CT or GA] -p -u -m20 -T6-0 -W -v 1 -n 0.0. Any significant edits found in untransfected or EGFP-transfected conditions were considered to be SNPs or artifacts of the transfection and filtered out from the analysis of off-targets. Off-targets were considered significant if Fisher's exact test yielded a p-value less than 0.05 after multiple hypothesis correction by the Benjamini-Hochberg procedure and at least 2 of 3 biological replicates identified the edit site. Overlap of edits between samples was calculated relative to the maximum possible overlap, equivalent to the smaller number of edits between the two samples; that is, the percentage of overlapping edit sites was calculated as the number of shared edit sites divided by the minimum number of edits of the two samples, multiplied by 100. An additional layer of filtering for known SNP positions was performed using the Kaviar (37) method for identifying SNPs.
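By way of illustration only, the significance filter and overlap calculation described above may be sketched as follows. The sketch assumes that a raw Fisher's exact test p-value is already available for each candidate site in each replicate, for example from the REDitools output; the function names, the string site identifiers, and the input layout are assumptions introduced here and do not reproduce Applicants' workflow.

```python
from collections import Counter


def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_sites(
    replicates: list[dict[str, float]],
    control_sites: frozenset = frozenset(),
    alpha: float = 0.05,
    min_replicates: int = 2,
) -> set:
    """Sites significant after correction in at least min_replicates
    replicates, minus sites also called in control conditions."""
    hits: Counter = Counter()
    for rep in replicates:
        sites = list(rep)
        adjusted = benjamini_hochberg([rep[s] for s in sites])
        for site, q in zip(sites, adjusted):
            if q < alpha:
                hits[site] += 1
    called = {s for s, n in hits.items() if n >= min_replicates}
    return called - set(control_sites)


def percent_overlap(a: set, b: set) -> float:
    """Shared sites divided by the smaller set size, times 100."""
    smaller = min(len(a), len(b))
    if smaller == 0:
        return 0.0
    return 100.0 * len(a & b) / smaller


if __name__ == "__main__":
    reps = [
        {"chr1:100": 0.001, "chr2:200": 0.5},
        {"chr1:100": 0.002, "chr2:200": 0.6},
        {"chr1:100": 0.9, "chr2:200": 0.01},
    ]
    print(significant_sites(reps))
    print(percent_overlap({"a", "b"}, {"a", "b", "c"}))
```

Dividing by the smaller set makes the overlap reach 100% whenever one sample's edits are fully contained in the other's, which matches the "maximum possible overlap" normalization in the text.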

Differential Gene Expression Analysis

STAT Phenotype Assay

Cells were transfected with RESCUE plasmids, guide plasmids targeting residues on STAT3 and STAT1, and a luciferase reporter for STAT3 (Qiagen Cignal STAT3 Reporter) and STAT1 signaling (Qiagen Cignal GAS Reporter) using Lipofectamine 2000, as described above, and incubated for 48 hours. After 48 hours, the Dual-Glo Luciferase Assay (Promega) was used to measure firefly and Renilla luciferase activity in the cells. The firefly signal was normalized to the Renilla signal to measure the relative activation of STAT3 and STAT1.
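By way of illustration only, the normalization of the firefly signal to the Renilla signal may be sketched as below. Expressing the result as a fold change relative to a control condition is an assumption of this sketch, as are the function names and the example readings.

```python
from statistics import mean


def relative_activation(firefly: list[float], renilla: list[float]) -> list[float]:
    """Per-well firefly signal divided by the matching Renilla signal."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and renilla must have the same length")
    return [f / r for f, r in zip(firefly, renilla)]


def fold_change(condition: list[float], control: list[float]) -> float:
    """Mean normalized signal of a condition relative to a control
    (comparison against a control is an assumption of this sketch)."""
    return mean(condition) / mean(control)


if __name__ == "__main__":
    # Hypothetical plate-reader values for three biological replicates.
    targeting = relative_activation([900.0, 1100.0, 1000.0], [100.0, 110.0, 100.0])
    control = relative_activation([480.0, 520.0, 500.0], [100.0, 104.0, 100.0])
    print(fold_change(targeting, control))
```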

Beta Catenin Phenotype Assay

Cells were plated 24 hours prior to transfection in cell migration plates containing cores that prevent cell growth in the center of the well. After 24 hours, cells were transfected with RESCUE plasmids, guide plasmids targeting residues on Beta-catenin, and a luciferase reporter for Beta-catenin activation (Qiagen TCF/LEF Cignal Reporter) using Lipofectamine 2000, as described above, and incubated. After 24 hours, central cores were removed to allow for cell growth towards the center of the well. After another 24 hours of incubation, media was assayed for Gluc and Cluc luciferase signal. The relative ratio of Gluc to Cluc was calculated to determine the relative Beta catenin activation between conditions. On day 3 cells were incubated for 10 minutes with CellTracker Green CMFDA Dye (ThermoFisher Scientific) and then washed with media. Cells were imaged daily using fluorescence to measure cell growth. Cell growth into the central area of the well was measured using ImageJ software by calculating the total area of fluorescence in the central growth region. Images were processed using an automated macro with the following commands:

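//ImageJ macro for calculating cellular area
// Convert the image to 8-bit grayscale, as required by Auto Local Threshold
run("8-bit");
// Segment cells with the Bernsen local threshold (radius of 15 pixels)
run("Auto Local Threshold", "method=Bernsen radius=15 parameter_1=0 parameter_2=0 white");
// Apply the default threshold, treating the background as dark
setAutoThreshold("Default dark");
// Measure the image; values are written to the Results table
run("Measure");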
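By way of illustration only, the quantity extracted by the above image analysis, namely the fluorescent area within the central growth region, may be sketched in a few lines. The sketch substitutes a single global intensity threshold for the Bernsen local threshold used in the macro and treats the central region as a circle centered in the image; both simplifications, together with the function name and parameters, are assumptions and do not reproduce the ImageJ processing.

```python
def central_fluorescent_area(
    image: list[list[float]], threshold: float, radius_px: float
) -> int:
    """Count pixels brighter than `threshold` inside a centred circle.

    `image` is a 2D list of pixel intensities. A global threshold is used
    here for simplicity (assumption), not the Bernsen local threshold.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    area = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            inside = (y - cy) ** 2 + (x - cx) ** 2 <= radius_px ** 2
            if inside and value > threshold:
                area += 1
    return area


if __name__ == "__main__":
    img = [[0] * 5 for _ in range(5)]
    img[2][2] = 255   # bright pixel at the centre
    img[0][0] = 255   # bright pixel outside the central region
    print(central_fluorescent_area(img, threshold=128, radius_px=1.0))
```

The returned pixel count can be converted to a physical area by multiplying by the squared pixel size of the microscope image, where that calibration is known.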

REFERENCES

1. S. Shmakov et al., Discovery and Functional Characterization of Diverse Class 2 CRISPR-Cas Systems. Mol Cell 60, 385-397 (2015).
2. S. Shmakov et al., Diversity and evolution of class 2 CRISPR-Cas systems. Nat Rev Microbiol 15, 169-182 (2017).
3. A. A. Smargon et al., Cas13b Is a Type VI-B CRISPR-Associated RNA-Guided RNase Differentially Regulated by Accessory Proteins Csx27 and Csx28. Mol Cell 65, 618-630 e617 (2017).
4. O. O. Abudayyeh et al., C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector. Science 353, aaf5573 (2016).
5. S. Konermann et al., Transcriptome Engineering with RNA-Targeting Type VI-D CRISPR Effectors. Cell 173, 665-676 e614 (2018).
6. W. X. Yan et al., Cas13d Is a Compact RNA-Targeting Type VI CRISPR Effector Positively Modulated by a WYL-Domain-Containing Accessory Protein. Mol Cell 70, 327-339 e325 (2018).
7. A. East-Seletsky et al., Two distinct RNase activities of CRISPR-C2c2 enable guide-RNA processing and RNA detection. Nature 538, 270-273 (2016).
8. J. S. Gootenberg et al., Nucleic acid detection with CRISPR-Cas13a/C2c2. Science 356, 438-442 (2017).
9. O. O. Abudayyeh et al., RNA targeting with CRISPR-Cas13. Nature 550, 280-284 (2017).
10. A. East-Seletsky, M. R. O'Connell, D. Burstein, G. J. Knott, J. A. Doudna, RNA Targeting by Functionally Orthogonal Type VI-A CRISPR-Cas Enzymes. Mol Cell 66, 373-383 e373 (2017).
11. D. B. T. Cox et al., RNA editing with CRISPR-Cas13. Science 358, 1019-1027 (2017).
12. J. S. Gootenberg et al., Multiplexed and portable nucleic acid detection platform with Cas13, Cas12a, and Csm6. Science 360, 439-444 (2018).
13. H. Nishimasu et al., Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 156, 935-949 (2014).
14. T. Yamano et al., Crystal Structure of Cpf1 in Complex with Guide RNA and Target DNA. Cell 165, 949-962 (2016).
15. L. Holm, L. M. Laakso, Dali server update. Nucleic Acids Res 44, W351-355 (2016).
16. H. Yang, P. Gao, K. R. Rajashankar, D. J. Patel, PAM-Dependent Target DNA Recognition and Cleavage by C2c1 CRISPR-Cas Endonuclease. Cell 167, 1814-1828 e1812 (2016).
17. L. Liu et al., Two Distant Catalytic Sites Are Responsible for C2c2 RNase Activities. Cell 168, 121-134 e112 (2017).
18. L. Liu et al., The Molecular Architecture for RNA-Guided RNA Cleavage by Cas13a. Cell 170, 714-726 e710 (2017).
19. G. J. Knott et al., Guide-bound structures of an RNA-targeting A-cleaving CRISPR-Cas13a enzyme. Nat Struct Mol Biol 24, 825-833 (2017).
20. N. F. Sheppard, C. V. Glover, 3rd, R. M. Terns, M. P. Terns, The CRISPR-associated Csx1 protein of Pyrococcus furiosus is an adenosine-specific endoribonuclease. RNA 22, 216-224 (2016).
21. Z. Wu, H. Yang, P. Colosi, Effect of genome size on AAV vector packaging. Mol Ther 18, 80-86 (2010).
22. X. J. Lu, H. J. Bussemaker, W. K. Olson, DSSR: an integrated software tool for dissecting the spatial structure of RNA. Nucleic Acids Res 43, e142 (2015).
23. I. Fonfara, H. Richter, M. Bratovic, A. Le Rhun, E. Charpentier, The CRISPR-associated DNA-cleaving enzyme Cpf1 also processes precursor CRISPR RNA. Nature 532, 517-521 (2016).
24. D. Milburn, R. A. Laskowski, J. M. Thornton, Sequences annotated by structure: a tool to facilitate the use of structural information in sequence analysis. Protein Eng 11, 855-859 (1998).
25. I. M. Slaymaker et al., Rationally engineered Cas9 nucleases with improved specificity. Science 351, 84-88 (2016).
26. L. Gao et al., Engineered Cpf1 variants with altered PAM specificities. Nat Biotechnol 35, 789-792 (2017).

Example 14—Transformation of the Adenine Deaminase ADAR2 into a Cytosine Deaminase for Programmable RNA Editing

Programmable RNA editing can enable reversible recoding of RNA information for research and disease treatment. This example describes a C to U RNA editor, referred to as RNA Editing for Specific C to U Exchange (RESCUE), generated by directly evolving ADAR2 into a cytidine deaminase. RESCUE doubled the number of pathogenic mutations targetable by RNA editing and enabled modulation of phosphosignaling-relevant residues, such as threonine and serine. Applicants applied RESCUE to drive β-catenin activation and cellular growth. Furthermore, RESCUE retained A to I editing activity, enabling multiplexed C to U and A to I editing through the use of tailored guide RNAs.

In summary, this example shows that programmable cytidine to uridine RNA editing with a directly evolved ADAR2 fused to CRISPR-Cas13 expands the RNA editing toolbox.

Applicants previously developed an RNA base editing technology called REPAIR (RNA editing for programmable A to I (G) replacement), which uses the RNA-targeting CRISPR effector Cas13 (1-6) to direct the catalytic domain of ADAR2 to specific RNA transcripts to achieve adenine to inosine conversion with single-base precision (7). Technologies for precise RNA editing of cytidine to uridine would greatly expand the range of addressable disease mutations as well as allow for signaling pathway modulation in cells via alteration of post-translational modification sites (FIG. 107A).

Although natural enzymes capable of catalyzing C to U conversion have been harnessed for DNA base editing (16, 17), they only operate on single-stranded substrates (18), exhibit off-targets across both the genome and transcriptome (19-21), and deaminate multiple bases within a window. In this example, Applicants took a synthetic approach to evolve the adenine deaminase domain of ADAR2 (ADAR2dd), which naturally acts on double-stranded RNA substrates and preferentially deaminates a target adenine mispaired with a cytidine, into a cytidine deaminase. Applicants fused this evolved cytidine deaminase to dCas13 to develop programmable RNA Editing for Specific C to U Exchange (RESCUE) in mammalian cells (FIG. 107B), which Applicants used to edit phosphorylation signaling of STAT and β-catenin proteins and modulate cell growth. Lastly, Applicants demonstrated multiplexed A to I and C to U base conversions with RESCUE and improved the specificity of RESCUE more than 10-fold via rational mutagenesis, generating a highly specific and precise C to U RNA editing tool.

Because a comparison of the E. coli cytidine deaminase and the human ADAR2dd showed remarkable structural homology between their catalytic cores (22) (FIG. 107B), Applicants selected residues of ADAR2dd contacting the RNA substrate (23) for three rounds of rational mutagenesis on an ADAR2dd fused to the catalytically inactive Cas13b ortholog from Riemerella anatipestifer (dRanCas13b), yielding RESCUE round 3 (RESCUEr3), with 15% editing activity (FIGS. 103A-103B, 108, 109A-109B). Applicants then began directed evolution across ADAR2dd in yeast to identify additional candidate mutations that increase the activity of RESCUE.

Sixteen rounds of evolution, culminating with the final construct RESCUEr16 (hereafter referred to as just RESCUE), resulted in increased cytidine deamination activity across all motifs tested, with higher than 20% editing on 12 out of 16 possible motif combinations of the immediately neighboring 5′ and 3′ bases (FIG. 103C, 110, 111A-111C, 112, 113A-113E). Applicants additionally characterized guide features necessary for robust activity, finding that RESCUE was optimally active with C or U base-flips across the target base using a 30-nt guide (FIG. 103C, 114A-114C, 115). Moreover, as dRanCas13b and the catalytically inactive Cas13b ortholog from Prevotella sp. P5-125 (dPspCas13b) were equivalent, the final RESCUE construct used dRanCas13b (FIG. 116).
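By way of non-limiting illustration, the 16 motif combinations referred to above arise from the four possible 5′ bases and four possible 3′ bases flanking the target cytidine. The following minimal Python sketch enumerates them; the function and variable names are illustrative assumptions and do not appear in the source.

```python
from itertools import product

RNA_BASES = "ACGU"  # the four RNA bases

def flanking_motifs(target="C"):
    """Enumerate every 5'-neighbor / target / 3'-neighbor trinucleotide motif.

    Hypothetical helper for illustration only.
    """
    return [five + target + three for five, three in product(RNA_BASES, repeat=2)]

motifs = flanking_motifs()
print(len(motifs))   # 16 possible motifs around the target cytidine
print(motifs[:4])    # ['ACA', 'ACC', 'ACG', 'ACU']
```

Motifs such as UCG, ACG, CCG, GCG, CCU, and CCA, which are named in the Materials and Methods, are all members of this set.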

The 16 mutations in RESCUE are distributed throughout the structure of ADAR2dd (FIG. 117A), indicating both direct interactions of the evolved residues with the RNA target within the catalytic pocket as well as indirect effects (FIG. 117B). These mutations enabled fitting of either adenosine or cytidine, as RESCUE was capable of both adenosine and cytidine deamination (FIGS. 108A-108D). Applicants evaluated the role of each mutation by individually adding it to REPAIR or removing it from RESCUE (FIGS. 119A-119D). Applicants found that mutations in the catalytic core (V351G, K350I) and mutations contacting the RNA target (S486A, S495N) were important to RESCUE activity. Biochemical characterization of RESCUE mutations on purified ADAR2dd showed no activity on dsDNA, ssDNA, or DNA-RNA heteroduplexes, with the evolved mutations improving the kinetics of C to U editing on dsRNA substrates in vitro (FIGS. 120A-120D).

As ADAR2 has been employed in other RNA editing platforms without Cas13 (8, 9, 11, 13), Applicants assayed C to U activity in the absence of a Cas13 fusion. Applicants introduced the RESCUE mutations into either ADAR2dd or the full-length ADAR2 protein in mammalian cells along with a guide RNA and assayed the ability of these constructs to restore luciferase activity, finding that the complete RESCUE construct, including the guide RNA direct repeat, was necessary for both adenosine and cytidine deamination activity (FIG. 103D, FIG. 121A-121D, 122A-122C, 123A-123C). To test C to U editing in alternative RNA editing systems, which rely on recruitment of MS2-ADAR2dd fusions (24) or full-length ADAR2 recruitment with RNA guides (11, 24), Applicants introduced the RESCUE mutations into these constructs and found that editing efficiency was markedly reduced compared to Cas13b-based RESCUE (FIGS. 124A-124F).

Applicants next evaluated the efficiency of RESCUE on endogenous transcripts in HEK293FT cells via bulk sequencing of cell populations. Applicants tested a variety of guide designs at 24 different sites across nine genes as well as on 24 synthetic disease-relevant mutation targets from ClinVar and found editing rates up to 42% (FIG. 103E, FIGS. 125A-125C, 126A-126B, 127A-127B, 130; Table 28). Across the guides tested (Tables 29-31), Applicants found multiple guide design rules, most notably related to features of the motif (5′ U or A preferred) and guide mismatch position.

To demonstrate control of signaling pathways via RNA editing of post-translational modification sites, Applicants altered activation of the STAT and Wnt/β-catenin pathways via modulation of key phosphorylation residues (FIG. 104A, 129A-129F). Mutating phosphorylated residues on β-catenin, such as S33, S37, and T41, inhibited ubiquitination and degradation, allowing the protein to engage transcription factors like LEF and TCF1/2/3 and leading to increased cell proliferation (25) (FIG. 104B). Applicants tested a panel of guides targeting the β-catenin transcript (CTNNB1) at residues known to be phosphorylated and observed editing levels between 5% and 28% (FIG. 104C), resulting in up to 5-fold activation of Wnt/β-catenin signaling (FIG. 104D) and increased cell growth in HEK293FT (FIGS. 104E-104F) and human umbilical vein endothelial cells (HUVECs) (FIGS. 130A-130B). As therapeutic applications with RESCUE may benefit from shorter constructs for viral delivery, Applicants also evaluated RESCUE activity with C-terminal truncations of dRanCas13b and found either similar or improved deaminase activity (FIG. 131).

Since RESCUE retained adenosine deaminase activity (FIGS. 118A-118D), the native pre-crRNA processing activity of Cas13b (4) enabled multiplexed adenine and cytosine deamination. By delivering RESCUE along with a pre-crRNA targeting an adenine and a cytosine in the CTNNB1 transcript (FIG. 105A), Applicants found that RESCUE could edit both targeted residues S33F and T41A at rates of ˜15% and 5%, respectively (FIG. 105B). However, in these experiments, as well as single-plex assays, Applicants found A to I off-targets near the targeted cytosine (FIGS. 132A-132C, 133A-133D). To eliminate these off-targets, Applicants introduced disfavorable guanine mismatches in the guide across from off-target adenosines (FIG. 105C), significantly reducing off-target editing while minimally disrupting the on-target editing (FIG. 105D).

Applicants profiled off-targets with whole-transcriptome RNA-sequencing, finding that while RESCUE had ˜80% C to U editing on the Gluc transcript (FIG. 106A), it had 188 C to U off-targets and 1,695 A to I off-targets, comparable to A to I off-targeting with REPAIRv1 (7) (FIGS. 108A, 108B). To improve the specificity of RESCUE, Applicants performed rational mutagenesis of ADAR2dd at residues interacting with the RNA target (FIG. 106C), resulting in multiple RESCUE mutants with reduced A to I off-target activity and high C to U on-target deamination activity, as measured by a luciferase reporter (FIG. 106D) and RNA sequencing (FIGS. 106E-106G). The top specificity mutant, S375A on RESCUE (hereafter referred to as RESCUE-S), maintained ˜76% on-target C to U editing (FIG. 106E), but only had 103 C to U off-targets and 139 A to I off-targets, an approximate 10-fold reduction in the number of adenine deamination off-targets (FIGS. 106E-106G), with diminished missense mutations and differentially regulated transcripts (FIGS. 134A-134F, 135A-135C, 136A-136B, 137A-137D). Applicants also found that RESCUE-S retained similar C to U activity as RESCUE at many endogenous sites, even exceeding it at some sites (FIGS. 138A-138C, 139A-139C, 140A), with higher specificity within the local guide window (FIG. 138C, 140B-140E).
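As a worked illustration of the specificity figures above, the fold reduction in off-target counts between RESCUE and RESCUE-S can be computed directly from the reported numbers. The following Python sketch uses only the counts stated in this example; the dictionary and function names are illustrative assumptions.

```python
# Off-target counts reported in this example for each construct.
RESCUE_OFF_TARGETS = {"C_to_U": 188, "A_to_I": 1695}
RESCUE_S_OFF_TARGETS = {"C_to_U": 103, "A_to_I": 139}

def fold_reduction(before, after):
    """Ratio of off-target counts before and after the specificity mutation."""
    return before / after

for edit_type in RESCUE_OFF_TARGETS:
    ratio = fold_reduction(RESCUE_OFF_TARGETS[edit_type], RESCUE_S_OFF_TARGETS[edit_type])
    print(edit_type, round(ratio, 1))
# C_to_U 1.8
# A_to_I 12.2
```

The A to I ratio of roughly 12 is consistent with the approximate 10-fold reduction stated above, whereas the number of C to U off-targets falls by a little under 2-fold.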

RESCUE is a programmable base editing tool capable of precise cytidine to uridine conversion in RNA. Using directed evolution, Applicants demonstrated that adenosine deaminases can be relaxed to accept other bases, resulting in a novel cytidine deamination mechanism that can edit dsRNA. While in the present study Applicants took advantage of the RNA-guided targeting mechanism of Cas13, other RNA targeting mechanisms (8-15, 24) can similarly be combined with evolved ADAR2dd mutants to achieve precise cytidine deamination on RNA transcripts. The larger targetable amino acid codon space of RESCUE's cytidine deamination activity enables modulation of more post-translational modifications, such as phosphorylation, glycosylation, and methylation, as well as expanded targeting of common catalytic residues (FIGS. 107 and 141). Moreover, cytidine deaminase-mediated RNA editing allows for additional targeting of disease-associated mutations and generation of protective alleles, such as ApoE2. Overall, RESCUE extends the RNA targeting toolkit with new base editing functionality, allowing for expanded modeling and potential treatment of genetic diseases.
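To illustrate the targetable codon space discussed above, the following Python sketch enumerates the amino acid substitutions reachable by a single C to U change within a codon, using the standard genetic code. It is a simplified sketch under stated assumptions: it considers one edit per codon and ignores sequence context, motif preference, and editing efficiency. All names are illustrative and are not part of the source.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def c_to_u_recodings():
    """Return sorted (original, edited) amino acid pairs from one C-to-U edit."""
    changes = set()
    for codon, aa in CODON_TABLE.items():
        for i, base in enumerate(codon):
            if base == "C":
                edited = codon[:i] + "U" + codon[i + 1:]
                new_aa = CODON_TABLE[edited]
                if new_aa != aa:  # skip synonymous edits
                    changes.add((aa, new_aa))
    return sorted(changes)

recodings = c_to_u_recodings()
for original, edited in recodings:
    print(f"{original} -> {edited}")
```

The output includes S -> F and T -> I, matching the S33F and T41I edits of CTNNB1 described in this example, as well as R -> C and P -> L, matching the direction in which the C82R and L77P Gluc reporter mutations are reverted.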

REFERENCES

-   1. O. O. Abudayyeh et al., C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector. Science 353, aaf5573 (2016).
-   2. C. Cassidy-Amstutz et al., Identification of a Minimal Peptide Tag for in Vivo and in Vitro Loading of Encapsulin. Biochemistry 55, 3461-3468 (2016).
-   3. S. Shmakov et al., Discovery and Functional Characterization of Diverse Class 2 CRISPR-Cas Systems. Mol Cell 60, 385-397 (2015).
-   4. A. A. Smargon et al., Cas13b Is a Type VI-B CRISPR-Associated RNA-Guided RNase Differentially Regulated by Accessory Proteins Csx27 and Csx28. Mol Cell 65, 618-630 e617 (2017).
-   5. A. East-Seletsky et al., Two distinct RNase activities of CRISPR-C2c2 enable guide-RNA processing and RNA detection. Nature 538, 270-273 (2016).
-   6. O. O. Abudayyeh et al., RNA targeting with CRISPR-Cas13. Nature 550, 280-284 (2017).
-   7. D. B. T. Cox et al., RNA editing with CRISPR-Cas13. Science 358, 1019-1027 (2017).
-   8. T. Merkle et al., Precise RNA editing by recruiting endogenous ADARs with antisense oligonucleotides. Nat Biotechnol 37, 133-138 (2019).
-   9. P. Vogel et al., Efficient and precise editing of endogenous transcripts with SNAP-tagged ADARs. Nat Methods 15, 535-538 (2018).
-   10. M. Fukuda et al., Construction of a guide-RNA for site-directed RNA mutagenesis utilizing intracellular A-to-I RNA editing. Sci Rep 7, 41478 (2017).
-   11. J. Wettengel, P. Reautschnig, S. Geisler, P. J. Kahle, T. Stafforst, Harnessing human ADAR2 for RNA repair—Recoding a PINK1 mutation rescues mitophagy. Nucleic Acids Res 45, 2797-2808 (2017).
-   12. M. F. Montiel-Gonzalez, I. C. Vallecillo-Viejo, J. J. Rosenthal, An efficient system for selectively altering genetic information within mRNAs. Nucleic Acids Res 44, e157 (2016).
-   13. P. Vogel, M. F. Schneider, J. Wettengel, T. Stafforst, Improving site-directed RNA editing in vitro and in cell culture by chemical modification of the guideRNA. Angew Chem Int Ed Engl 53, 6267-6271 (2014).
-   14. M. F. Montiel-Gonzalez, I. Vallecillo-Viejo, G. A. Yudowski, J. J. Rosenthal, Correction of mutations within the cystic fibrosis transmembrane conductance regulator by site-directed RNA editing. Proc Natl Acad Sci USA 110, 18285-18290 (2013).
-   15. H. A. Rees, D. R. Liu, Base editing: precision chemistry on the genome and transcriptome of living cells. Nat Rev Genet 19, 770-788 (2018).
-   16. A. C. Komor, Y. B. Kim, M. S. Packer, J. A. Zuris, D. R. Liu, Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420-424 (2016).
-   17. K. Nishida et al., Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science 353, (2016).
-   18. J. D. Salter, R. P. Bennett, H. C. Smith, The APOBEC Protein Family: United by Structure, Divergent in Function. Trends Biochem Sci 41, 578-594 (2016).
-   19. S. Jin et al., Cytosine, but not adenine, base editors induce genome-wide off-target mutations in rice. Science, (2019).
-   20. E. Zuo et al., Cytosine base editor generates substantial off-target single-nucleotide variants in mouse embryos. Science, (2019).
-   21. J. Grunewald et al., Transcriptome-wide off-target RNA editing induced by CRISPR-guided DNA base editors. Nature, (2019).
-   22. M. R. Macbeth et al., Inositol hexakisphosphate is bound in the ADAR2 core and required for RNA editing. Science 309, 1534-1539 (2005).
-   23. M. M. Matthews et al., Structures of human ADAR2 bound to dsRNA reveal base-flipping mechanism and basis for site selectivity. Nature structural & molecular biology 23, 426-433 (2016).
-   24. D. Katrekar et al., In vivo RNA editing of point mutations via RNA-guided adenosine deaminases. Nat Methods 16, 239-242 (2019).
-   25. B. T. MacDonald, K. Tamai, X. He, Wnt/beta-catenin signaling: components, mechanisms, and diseases. Dev Cell 17, 9-26 (2009).
-   26. M. K. Chee, S. B. Haase, New and Redesigned pRS Plasmid Shuttle Vectors for Genetic Manipulation of Saccharomyces cerevisiae. G3 (Bethesda) 2, 515-526 (2012).
-   27. M. F. Laughery et al., New vectors for simple and streamlined CRISPR-Cas9 genome editing in Saccharomyces cerevisiae. Yeast 32, 711-720 (2015).
-   28. M. R. Macbeth, B. L. Bass, Large-scale overexpression and purification of ADARs from Saccharomyces cerevisiae for biophysical and biochemical studies. Methods Enzymol 424, 319-310 (2007).
-   29. H. Ng, N. Dean, Dramatic Improvement of CRISPR/Cas9 Editing in Candida albicans by Increased Single Guide RNA Expression. mSphere 2, (2017).
-   30. R. Heim, D. C. Prasher, R. Y. Tsien, Wavelength mutations and posttranslational autoxidation of green fluorescent protein. Proc Natl Acad Sci USA 91, 12501-12504 (1994).
-   31. Y. Wang, P. A. Beal, Probing RNA recognition by human ADAR2 using a high-throughput mutagenesis method. Nucleic Acids Res 44, 9872-9880 (2016).
-   32. R. D. Gietz, R. H. Schiestl, Large-scale high-efficiency yeast transformation using the LiAc/SS carrier DNA/PEG method. Nat Protoc 2, 38-41 (2007).
-   33. M. T. Veeman, D. C. Slusarski, A. Kaykas, S. H. Louie, R. T. Moon, Zebrafish prickle, a modulator of noncanonical Wnt/Fz signaling, regulates gastrulation movements. Curr Biol 13, 680-685 (2003).

Materials and Methods

Design and Cloning of Yeast Constructs

For expression of the dRanCas13b-hADAR2dd construct in yeast, the fusion protein was cloned downstream of a pGAL promoter in a pRSII426 backbone (26), by modifying pML104 (Addgene #67638) (27). To improve expression, a GS linker was cloned between the fusion proteins, and ADAR2dd was codon optimized for yeast (28). Additional codon mutations, corresponding to rounds of RESCUE, were introduced via Gibson cloning.

Targeting plasmids for testing activity in yeast were engineered for both fluorescent screens (GFP) and auxotrophic selection screens (His). All targeting plasmids were cloned into the pYES3/CT backbone (Thermo Scientific). All plasmids contained a RanCas13b guide cassette for RESCUE, with expression driven by the ADH1 promoter, and spacer and DR sequences flanked by HH and HDV ribozymes (29). A construct with the spacer replaced by a golden gate site was cloned to facilitate modular guide cloning.

To generate a GFP indicator of C to U RNA editing activity, the Y66H green-to-blue mutation (30) was introduced into a yeast codon-optimized EGFP (yeGFP) (31) driven by the TEF promoter. Successful C to U RNA editing restores the green fluorescence of this construct. His reporters for C to U editing were generated by testing conserved residues in HIS3 for loss of activity when mutated to residues that could be rescued by RNA editing (FIG. 128). Mutations that created inactive HIS3 were cloned into a HIS3 gene, under its native HIS3 promoter, in the pYES3/CT backbone.

All yeast plasmids are listed in Table 33, and all targeting guides used in yeast experiments are listed in Table 34.

RESCUE Directed Evolution

To select for C to U activity in yeast, Applicants engineered a set of yeast reporter assays based on either restoration of GFP fluorescence or prototrophic reversion of a HIS auxotrophic selection gene. Sequencing GFP-positive cultures or colonies that survived in the absence of histidine identified individual mutations in the ADAR2dd domain, which were introduced onto the previous RESCUE candidate round and evaluated for activity in mammalian cells using various reporter constructs. After optimizing luciferase activity on the UCG luciferase site (C82R) for 11 rounds, Applicants switched to optimizing at the T41 site on the CTNNB1 transcript for two rounds and then the CCU site (L77P) on the Gluc transcript for another two rounds. In the final round, Applicants tested for restoration of activity of luciferase mutants with all four possible 5′ bases at the Gluc C82R site (UCG, ACG, CCG, and GCG) and two additional motifs (CCU and CCA) at the Gluc L77P mutation, finding increases in activity with these motifs (FIG. 103B, FIG. 110). To further validate the RESCUEr versions from the directed evolution pipeline, Applicants tested multiple RESCUEr iterations for activity both in yeast and in in vitro assays (FIGS. 113A-113E and 120A-120D). Testing both EGFP and His restoration in yeast, Applicants found that later versions of RESCUEr could more effectively perform C to U editing on both targets (FIGS. 113A-113E). After each round of yeast screening, top mutations were evaluated on a series of mammalian reporters to validate activity and select the top mutant for the next round of yeast screening. All screens and resulting mutations are listed in Table 27.

Generation of Mutagenesis Libraries for Yeast Screening

To generate mutagenesis libraries for screening mutations in yeast systems, the hADAR2 deaminase domain was mutated using Genemorph II (Agilent Technologies) for error-prone PCR across eight 50 mL reactions ranging in template input from 74 ng-9.4 μg via a two-fold dilution series. Following amplification, reactions were pooled, diluted 1:4 in DI water and loaded into a 2% gel containing ethidium bromide. Extracted samples were purified using a MinElute PCR Purification Kit (Qiagen) before treatment with DpnI (Thermo Fisher Scientific) at 37° C. for 2 h to remove residual template plasmid and subsequent gel and MinElute purification. The backbone for cloning was generated by digesting 7 μg of template plasmid with KflI, RruI, and Eco72I (Thermo Fisher Scientific) for 1 hour. The digest was gel purified with the MinElute PCR Purification kit and eluted in 30 μL of pre-warmed water.
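As a worked check of the template inputs described above, a two-fold series of eight reactions starting at 74 ng can be sketched in Python as follows. The sketch assumes the lowest input is 74 ng and that each subsequent reaction doubles the input; the function name is an illustrative assumption.

```python
def twofold_series(start_ng, n_reactions):
    """Template input (ng) for each reaction in a two-fold series."""
    return [start_ng * 2 ** i for i in range(n_reactions)]

template_inputs_ng = twofold_series(74, 8)
print(template_inputs_ng)
# [74, 148, 296, 592, 1184, 2368, 4736, 9472]
print(template_inputs_ng[-1] / 1000)  # 9.472 (μg)
```

Under these assumptions the highest input is 9,472 ng, or about 9.47 μg, in line with the stated upper bound of 9.4 μg.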

The purified PCR insert and digested backbone were assembled using Gibson Assembly (New England Biolabs), with 456 ng of PCR insert and 800 ng of backbone digest incubated in an 80 μL reaction for 1 hour. The product was pelleted with isopropanol precipitation and resuspended in 12 μL of Tris-EDTA buffer via heating to 50° C. for 5 minutes. 50 μL of Endura Electrocompetent cells (Lucigen) were thawed on ice for 10 minutes and 2 μL of resuspended Gibson product was added. The mixture was electroporated using a GenePulser Xcell (Bio-Rad) following optimal Endura settings (1.0 mm cuvette, 10 μF, 600 Ohm, 1800 V). Samples from each electroporation were recovered in 1 mL of Recovery Media (Lucigen) and incubated at 37° C. for 1 hour while shaking at 300 r.p.m. Two electroporations were performed per mutagenesis library. The recovered culture was plated on a large pre-warmed 100 μg/mL ampicillin plate, and plates were incubated at 37° C. for 16 hours before harvesting with the Nucleobond Xtra Maxi Kit (Macherey-Nagel).

Transformation of Mutagenesis Libraries in Yeast

All yeast experiments were performed using INVSc1 (ThermoFisher Scientific). Large-scale yeast transformation was carried out as previously described (32). Briefly, colonies containing the Y66H EGFP or HIS3 reporter plasmids were picked into 300 mL -Trp 2% glucose selection media and grown overnight at 30° C. After growth, the OD600 of the cells was determined and 2.5e9 cells were added to 500 mL of pre-warmed 2×YPAD and incubated for 4 hours at 30° C. The cell pellet was washed multiple times and then resuspended in 36 mL of transformation mix containing 24 mL of PEG 3350 (50% w/v), 3.6 mL of 1.0 M lithium acetate, 5 mL of denatured single-stranded carrier salmon sperm DNA at 2.0 mg/mL (ThermoFisher Scientific), 2.9 mL of water, and 500 μL of 1 μg/μL plasmid library. After incubation at 42° C. for 60 minutes, the cell pellet was resuspended in 750 mL of -Ura/-Trp 2% glucose selection media and grown overnight until the culture reached an OD600 of 5-6. At that point, 6 mL of the culture was seeded into 250 mL of 2% raffinose -Ura/-Trp selection media and incubated until the OD600 was 0.5-1. Cultures were induced by adding 27 mL of 30% galactose and incubated overnight at 30° C. for 12-14 hours. Cells were then either subjected to cell sorting or plated on selection plates, as described below. Any validation experiments involving single mutants were transformed in a similar way, but using a scaled-down version of the large-scale transformation above.
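The component volumes listed above can be checked against the stated 36 mL total, and the resulting final concentrations derived, with a short Python sketch. The volumes and stock concentrations are taken from the text; the variable names and the derived final concentrations are illustrative calculations rather than values stated in the source.

```python
import math

# Component volumes of the transformation mix, in mL, as listed above.
TRANSFORMATION_MIX_ML = {
    "PEG 3350 (50% w/v)": 24.0,
    "1.0 M lithium acetate": 3.6,
    "carrier DNA (2.0 mg/mL)": 5.0,
    "water": 2.9,
    "plasmid library (1 ug/uL)": 0.5,
}

total_ml = sum(TRANSFORMATION_MIX_ML.values())
print(round(total_ml, 1))  # 36.0

# Derived values (simple dilution arithmetic, not stated in the source).
final_peg_pct = 50.0 * TRANSFORMATION_MIX_ML["PEG 3350 (50% w/v)"] / total_ml
final_liac_molar = 1.0 * TRANSFORMATION_MIX_ML["1.0 M lithium acetate"] / total_ml
plasmid_mass_ug = 0.5 * 1000 * 1.0  # 500 uL at 1 ug/uL
print(round(final_peg_pct, 1), round(final_liac_molar, 2), plasmid_mass_ug)
# 33.3 0.1 500.0
```

The listed components sum to the stated 36 mL, and the plasmid library input corresponds to 500 μg of DNA.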

Fluorescent Cell Sorting of Yeast Libraries

After induction, cells were sorted on a SH800S Cell Sorter by gating for EGFP fluorescence compared to a negative non-induced and non-targeting guide control. After 100 million cells had been sorted into 2% glucose -Ura/-Trp selection media, sorted cells were incubated overnight and then diluted 1:40 into 2% raffinose -Ura/-Trp selection media at an OD600 of 5-6. Cells were returned to the shaker, induced with galactose at an OD600 between 0.5-1, and incubated overnight for 12-14 hours before sorting again. Sorting was performed until 10-20 million cells had been sorted. Iterative growth and sorting were repeated 2-3 additional times, with each iteration of sorted cells harvested for plasmid with Zymoprep Yeast Plasmid Miniprep II (Zymo). The ADAR2dd region of the plasmid was PCR amplified and sequenced by Illumina NextSeq NGS to ascertain the mutants present at each round of selection. Top enriched mutants were individually ordered and cloned for mammalian validation testing as described below.

His Growth Selection of Yeast Libraries

After induction, the cell library was plated on 2% raffinose/3% galactose -Ura/-Trp/-His selection plates. As colonies grew, they were picked into water and streaked on 2% raffinose/3% galactose -Ura/-Trp/-His selection plates. After overnight growth of the streaks, colony PCR was performed on each streak, and the products were subjected to Sanger sequencing of the ADAR2 catalytic domain as well as the His gene to check for recombination and DNA mutagenesis. Mutations were individually ordered and cloned for mammalian validation testing as described below.

Design and Cloning of Mammalian Constructs for RNA Editing

RanCas13b was made catalytically inactive (dRanCas13b) via histidine to alanine and arginine to alanine mutations (R142A/H147A/R1039A/H1044A) at the catalytic site of the HEPN domains. The deaminase domain and ADAR2 were synthesized and PCR amplified for Gibson cloning into pcDNA-CMV vector backbones and were fused to dRanCas13b at the C-terminus via a GS-mapkNES-GS (GSSLQKKLEELELGS (SEQ ID NO:779)) linker. Mutations in the ADAR2 deaminase domain for altering cytosine deamination activity or specificity were introduced by Gibson cloning into the dRanCas13b-GS-mapkNES-GS-ADAR2dd backbone. All mutations introduced into ADAR2dd for evolving C to U editing are listed in Table 27.

For comparison between different Cas13b orthologs, mutations tested on the dRanCas13b backbone were transferred to a dPspCas13b fusion vector by Gibson cloning onto the REPAIR construct (7), dPspCas13b-GS-HIVNES-GS-ADAR2dd. For testing the ADAR2dd alone without dRanCas13b and the full-length ADAR2, Applicants used Gibson cloning to add all mutations to pcDNA-CMV vector backbones with ADAR2dd or full-length ADAR2, previously cloned to test REPAIR (7). Luciferase reporter vectors for measuring C to U RNA editing activity were generated by screening potential mutations in Gluc in the previously reported luciferase reporter plasmid (7). This reporter vector expresses functional Cluc as a normalization control, but a defective Gluc due to the addition of mutations (either C82R or L77P). To test RESCUE editing motif preferences, Applicants cloned every possible motif around the cytosine at codon 82 (AAX CXC) of Gluc. Mutants were evaluated for C to U editing of C82R and restoration of catalytic activity (33). As the surrounding motif strongly determines RNA editing efficiency for A to I editing (7), Applicants initially targeted a UCG site since a 5′ U and 3′ G are the preferred flanking bases for optimal ADAR2dd activity. Secreted luciferase reporter vectors for testing CTNNB1 editing efficiency were generated from M50 Super 8× TOPFlash (Addgene #12456) and M50 Super 8× FOPFlash (Addgene #12457) (34). The original firefly luciferase, under control of either TCF/LEF responsive elements (TOPFlash) or mock binding sites (FOPFlash), was replaced with a secreted Gaussia luciferase via Gibson cloning. An additional Cypridina luciferase with expression driven by a CMV promoter was cloned in to serve as a transfection control. All mammalian plasmids are listed in Table 32.

Selection of Candidate Rounds in Mammalian Cells

Mutations that performed comparably to or better than the existing candidate round were selected for screening on the entire panel of 6 luciferase reporters. For the selection of RESCUEr4 through RESCUEr10, candidate mutations were initially screened on TCG motifs; candidate round RESCUEr11 was isolated using GCG motifs as the initial screening. Candidate rounds RESCUEr12 through RESCUEr14 were validated in mammalian cells using an initial screening on editing of the T41I residue of endogenous CTNNB1, resulting in β-catenin pathway activation that was profiled with luminescent reporters of pathway activity, and candidate rounds RESCUEr15 and RESCUEr16 were selected via activity on the L77P CCT motif of Gluc. All rounds and the yeast screens used to generate them are listed in Table 27.

Cloning Pathogenic U>C Mutations for Assaying RESCUE Activity

To generate disease-relevant mutations for testing RESCUE activity, 23 U>C mutations related to disease pathogenesis, as defined in ClinVar, were selected (grouped as a panel of 22 genes and ApoE independently). Selected targets were ordered from Integrated DNA Technologies as 200-bp regions surrounding the mutation site, and were cloned downstream of mScarlet under an EF1alpha promoter.

Guide Cloning for RESCUE

For expression of mammalian guide RNAs for RESCUE, a previously described construct (7) with a RanCas13b direct repeat sequence preceded by golden-gate acceptor sites under U6 expression was used. Individual guides were cloned into this expression backbone by golden-gate cloning. To determine optimal guides for select sites, both C and U flips were tested, as well as tiling guides around the most common optimal guide range (mismatch distance of ˜24). Guide sequences for RESCUE experiments are listed in Tables 29-31.

Mammalian Cell Culture

Unless otherwise stated, mammalian cell culture experiments were performed in the HEK293FT line (American Type Culture Collection (ATCC)), grown in Dulbecco's Modified Eagle Medium containing glucose, sodium pyruvate, and GlutaMAX (Thermo Fisher Scientific), and supplemented with 1× penicillin-streptomycin (Thermo Fisher Scientific) and 10% fetal bovine serum (VWR Seradigm). Cells were maintained at confluency below 80%.

Unless otherwise noted, all transfections were performed with Lipofectamine 2000 (Thermo Fisher Scientific) in 96-well plates coated with poly-D-lysine (BD Biocoat). Cells were plated at approximately 20,000 cells/well 16 hours prior to transfection to ensure 90% confluency at the time of transfection. For each well on the plate, transfection plasmids were combined with Opti-MEM I Reduced Serum Medium (Thermo Fisher Scientific) to a total of 25 μl. Separately, 24.5 μl of Opti-MEM was combined with 0.5 μl of Lipofectamine 2000. Plasmid and Lipofectamine solutions were then combined and incubated for 5 minutes, after which they were pipetted onto cells.

HUVEC cells (Lonza) were cultured in Endothelial Growth Media-2 (Lonza) on Nunc Collagen I Coated EasYFlasks (Thermo Fisher Scientific). Cells were maintained at confluency below 80%. HUVEC transfections were performed with Lipofectamine LTX (Thermo Fisher Scientific) in 96-well plates coated with Collagen I (BD Biocoat). Cells were plated at approximately 5,000 cells/well 16 hours prior to transfection. Culture media was replaced with fresh EGM-2 immediately before transfection. For each well on the plate, transfection plasmids were combined with 1 μL Plus reagent and Opti-MEM to a total of 25 μL. Separately, 24.7 μL of Opti-MEM was combined with 0.3 μL of Lipofectamine LTX. Plasmid and LTX solutions were then combined and incubated for 25 minutes, after which they were pipetted onto cells. After 4 hours, cells were washed with PBS and media was replaced with fresh EGM-2.
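For illustration, the per-well transfection volumes described above for HEK293FT and HUVEC cells can be tallied and scaled to a number of wells with the following Python sketch. The per-well volumes are taken from the text; the data structure, function name, and the choice of 96 wells are illustrative assumptions, and no pipetting overage is included.

```python
import math

# Per-well volumes in microliters, as described above.
PER_WELL_UL = {
    "HEK293FT": {"plasmid_mix": 25.0, "opti_mem": 24.5, "lipid_reagent": 0.5},
    "HUVEC": {"plasmid_mix": 25.0, "opti_mem": 24.7, "lipid_reagent": 0.3},
}

def scaled_volumes(cell_type, n_wells):
    """Scale each per-well component volume to n_wells (no overage added)."""
    return {name: vol * n_wells for name, vol in PER_WELL_UL[cell_type].items()}

for cell_type, parts in PER_WELL_UL.items():
    print(cell_type, round(sum(parts.values()), 1))  # 50.0 uL per well for both

plate = scaled_volumes("HEK293FT", 96)
print(plate["lipid_reagent"])  # 48.0 uL of Lipofectamine 2000 for 96 wells
```

In both protocols the combined plasmid and lipid solutions amount to 50 μL per well.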

RESCUE Editing in Mammalian Cells

To assess RESCUE activity in mammalian cells, Applicants transfected 150 ng of RESCUE vector, 300 ng of guide expression plasmid, and, when using a reporter (either luciferase, STAT activity, or β-catenin activity), 40 ng of the RNA editing reporter. After 48 hours, RNA from cells was harvested and reverse transcribed using a method previously described (33) with a gene specific reverse transcription primer. The extracted cDNA was then subjected to two rounds of PCR to add Illumina adaptors and sample barcodes using NEBNext High-Fidelity 2× PCR Master Mix (New England Biolabs). The library was then subjected to next generation sequencing on an Illumina NextSeq or MiSeq. RNA editing rates were then evaluated at all adenosines within the sequencing window.

In experiments where the luciferase reporter was targeted for RNA editing, Applicants also harvested the media with secreted luciferase prior to RNA harvest. Applicants measured luciferase activity with Cypridina and Gaussia luciferase assay kits (Targeting Systems) on a plate reader (Biotek Synergy Neo2) with an injection protocol. All replicates performed are biological replicates.

In experiments where the input amount of RESCUE plasmid was varied, total plasmid amount was kept constant by replacing RESCUE expression plasmid with a filler plasmid expressing a CMV-driven mScarlet, except where noted. In the experiment where input amount of guide plasmid was varied, total plasmid amount was either kept constant (“with filler plasmid”) via substitution of non-targeting guide, or not kept constant (“without filler plasmid”); in this experiment, there was no filler plasmid for the RESCUE plasmid.
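The filler-plasmid bookkeeping for a RESCUE dose titration can be sketched as follows. This is an illustration under stated assumptions: it takes the 150 ng RESCUE, 300 ng guide, and 40 ng reporter amounts given above as the baseline for the titration, and the function and field names are hypothetical.

```python
def rescue_dose_mix(rescue_ng, max_rescue_ng=150.0, guide_ng=300.0, reporter_ng=40.0):
    """Return per-well plasmid amounts (ng) with filler added to keep the total constant.

    Assumption: the baseline amounts are those of the standard RESCUE
    transfection; the filler stands for the CMV-driven mScarlet plasmid.
    """
    if not 0 <= rescue_ng <= max_rescue_ng:
        raise ValueError("RESCUE amount must lie between 0 and the baseline amount")
    # Filler replaces whatever RESCUE plasmid was removed
    filler_ng = max_rescue_ng - rescue_ng
    return {
        "rescue_ng": rescue_ng,
        "filler_ng": filler_ng,
        "guide_ng": guide_ng,
        "reporter_ng": reporter_ng,
        "total_ng": rescue_ng + filler_ng + guide_ng + reporter_ng,
    }


if __name__ == "__main__":
    for dose in (150.0, 75.0, 37.5):
        print(rescue_dose_mix(dose))
```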

Considerations for RESCUE Guide Design

Applicants tested a panel of guide RNAs with varying mismatch positions targeting 24 different sites across nine genes (FIGS. 103E, 125A-125C), specifically choosing varying 5′ base identities to interrogate the deamination activity on different motifs. Applicants found that RESCUE achieved editing rates up to 35% at all sites tested, and that the ideal mismatch position or base-flip (C or U) was site dependent. Moreover, RESCUE outperformed all previous rounds of mutants on multiple endogenous sites and required less transfected plasmid than earlier versions (FIGS. 126A-126B). To better evaluate the relevance of RESCUE for therapeutics, Applicants designed a series of 24 targets to model editing of disease-relevant mutations from ClinVar (see Table 28), and found editing rates up to 42% as measured by bulk sequencing (FIGS. 129A-129B), including the Alzheimer's risk related ApoE4 allele (FIG. 128).

After analyzing all guides in the paper, Applicants found that the optimal guide design differs between target sites. Applicants recommend testing a variety of guide designs per new target site, including both C and U flips as well as varying mismatch positions. An example set of designs to test would include a 30 nt guide with a C or U flip and the mismatch at each of the following positions: 28, 26, 24, 22, and 20. Overall, Applicants find that any cytidine site that is flanked by a 5′ U or A will have robust editing activity. Sites with a 5′ C or G will be edited with less efficiency.
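The recommended panel of designs can be enumerated programmatically. The sketch below makes two assumptions that are not stated in the text: the mismatch position is counted from the 3′ end of the spacer (inferred from the spacers listed in Table 29), and the spacer is the reverse complement of the target window with the base opposite the target cytidine replaced by the flip base. The function name is hypothetical, and the example target is simply the reverse complement of a Table 29 spacer.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def design_spacers(target_rna, c_index, length=30,
                   positions=(28, 26, 24, 22, 20), flips=("C", "U")):
    """Enumerate spacers with a C or U flip opposite the target cytidine.

    target_rna: target transcript (5'->3'); c_index: 0-based index of the C.
    Position convention (assumption): counted from the 3' end of the spacer.
    Designs whose window would run off the supplied target are skipped.
    """
    target = target_rna.upper().replace("T", "U")
    if target[c_index] != "C":
        raise ValueError("c_index must point at a cytidine")
    designs = {}
    for pos in positions:
        if not 1 <= pos <= length:
            raise ValueError("mismatch position must lie within the spacer")
        start = c_index - pos + 1          # first target base covered
        end = start + length
        if start < 0 or end > len(target):
            continue                        # not enough flanking sequence
        spacer = [COMPLEMENT[b] for b in reversed(target[start:end])]
        for flip in flips:
            variant = list(spacer)
            variant[length - pos] = flip    # base opposite the target C
            designs[(pos, flip)] = "".join(variant)
    return designs


if __name__ == "__main__":
    gluc_window = "UCUGAUCUGCCUGUCCCACAUCAAUCGCAC"  # C at 0-based index 25
    for key, spacer in design_spacers(gluc_window, 25).items():
        print(key, spacer)
```

With a longer stretch of target sequence supplied, the same call would return all ten designs (five positions, two flips); a 5′ G for U6 transcription, where used, would be prepended separately.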

Biochemical Characterization of RESCUE Mutations on ADAR2dd

To assess kinetic activity of hADAR2 deaminase domain containing RESCUE mutations, multiple iterations were cloned into a pGAL-His6-TwinStrep-SUMO-hADAR2dd backbone containing the URA3 gene. The plasmids were transformed into BCY123 competent yeast cells (10). Briefly, frozen cells were thawed in a 37° C. water bath for 15-30 seconds. 10 μL of cells per condition were centrifuged at 13,000 g in a microcentrifuge for 2 minutes and supernatant was removed. The prepared transformation mix for each construct contained 260 μL PEG 3350 prepared at 50% w/v, 50 μL of denatured salmon sperm (Thermo Fisher Scientific), 36 μL 1 M lithium acetate, and 750 ng of plasmid in 14 μL of DI H2O. The yeast pellet was resuspended with the transformation mix and incubated in a 42° C. water bath for 30 minutes before centrifugation at 13,000 g for 30 seconds and subsequent supernatant removal. The pellet was then resuspended in 1 mL of DI H2O and 50 μL was taken into 1 mL of DI H2O for mixing. Subsequently, 200 μL was plated onto minimal glucose plates minus uracil for prototrophic selection.

Plates were incubated at 30° C. for 48 hr before seeding single colonies into 10 mL cultures of yeast minimal media supplemented with dextrose (20 g/L). Minimal media was prepared with yeast dropout supplement Y2001 (1.39 g/L), yeast nitrogen base without amino acids (6.7 g/L), adenine hemisulfate (0.022 g/L), histidine (0.076 g/L), leucine (0.38 g/L), and tryptophan (0.076 g/L). Cultures were grown overnight before seeding the entire 10 mL culture into a 100 mL minimal media/dextrose culture. Following 8 hours of growth, each construct was seeded into two 2 L flasks containing 1 L of minimal media supplemented with 20 g of raffinose (VWR). These were grown overnight and induced by the addition of 30 g of galactose dissolved in 200 mL of minimal media; cultures were then grown for an additional eight hours before harvesting. Cultures were spun down in a Beckman Coulter Avanti J-E centrifuge at 5,000 RPM for 20 minutes, and the resulting pellets were stored at −80° C.

Protein purification of the different RESCUE candidate hADAR2 deaminase domains was modified from the protocol described in Macbeth and Bass (28). In brief, 5-10 g of frozen yeast pellet was resuspended in 50 mL lysis buffer (20 mM Tris-HCl pH 8, 5% glycerol, 750 mM NaCl, 1 mM beta-mercaptoethanol, 0.01% Triton-X) supplemented with one tablet of EDTA-free mini cOmplete ULTRA protease inhibitors (Sigma). The suspension was passed seven times through a LM20 microfluidizer at 25,000 psi, and the cell debris was pelleted by centrifugation at 9,500 RPM for 80 minutes. The cleared lysate was decanted off and incubated with 1 mL of StrepTactin superflow resin (Qiagen) for 2.5 hours, gently shaking using a rotary shaker at 4° C. The suspension was added to an Econo-column chromatography column pre-equilibrated with lysis buffer, and the resin was washed with 40 mL of lysis buffer. Three subsequent washes (40 mL each) lowered the salt concentration (500 mM, 250 mM, then 100 mM NaCl). Protein was cleaved off the resin by gently shaking overnight on a table shaker in 20 mL of lysis buffer supplemented with 100 μg of SUMO protease (in-house). Flow-through was collected and combined with 3×5 mL washes of the resin with lysis buffer. The entire fraction containing cleaved protein was loaded onto a 5 mL Heparin HP cation exchange column (GE Healthcare Life Sciences), and eluted over a NaCl gradient from 100 mM to 1 M (buffers 20 mM Tris-HCl pH 8, 5% glycerol, 1 mM beta-mercaptoethanol with respective NaCl concentration). Fractions were checked for purity and analyzed using SDS-PAGE and Coomassie staining, and protein-containing fractions were pooled and concentrated using 10 MWCO centrifugal filters (Amicon). The concentration in mg/mL of each protein was determined by Coomassie staining and SDS-PAGE against a serial dilution of BSA (starting at 1 mg/mL). Bands were quantified using ImageLab software (BioRad Image Lab Software 6.0.1), and the concentration was estimated by interpolation of a linear regression of the BSA standard.
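The standard-curve step can be illustrated with an ordinary least-squares fit of band intensity against BSA concentration, followed by inversion of the fitted line. This is a sketch only: the function names are hypothetical, the intensities in the usage example are made-up placeholders rather than measured data, and Image Lab's own quantification may differ.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_concentration(band_intensity, standards):
    """Interpolate a concentration (mg/mL) from a BSA standard curve.

    standards: list of (BSA concentration in mg/mL, band intensity) pairs.
    """
    concentrations, intensities = zip(*standards)
    slope, intercept = fit_line(concentrations, intensities)
    return (band_intensity - intercept) / slope


if __name__ == "__main__":
    # Placeholder two-fold BSA dilution series starting at 1 mg/mL
    bsa = [(1.0, 1000.0), (0.5, 500.0), (0.25, 250.0), (0.125, 125.0)]
    print(estimate_concentration(400.0, bsa))
```

Samples whose band intensity falls outside the range spanned by the standards would be extrapolated rather than interpolated and are better re-run at a different dilution.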

ssRNA and DNA oligonucleotides with DNA handles (Integrated DNA Technologies) were annealed in 1× duplex buffer (HEPES 30 mM pH 7.5, K+ Acetate 100 mM) at 85° C. for 5 minutes with a slow ramp to 4° C., then purified using Oligo Clean & Concentrator (Zymo), quantified with a Nanodrop, and normalized to 100 ng/μL.

In vitro assays were performed as previously described (23) with slight modifications. Assays were set up on ice with 25 nM RNA substrate, 50 nM ADAR protein, 0.16 U/μL RNase inhibitor, and 15.6 mM NaCl in 1× assay buffer (17 mM Tris-HCl pH 7.5, 5% glycerol, 1.6 mM EDTA, 0.003% NP-40, 0.5 mM TCEP). 20 μL reactions (with three technical replicates) were incubated at 30° C. for a range of timepoints (0, 5, 10, 30, and 60 minutes). Reactions were quenched by the addition of 10 μL of 0.5% SDS solution (to a final concentration of 0.166% SDS), and denatured for 5 minutes at 95° C.
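The quoted SDS concentration after quenching follows from a simple dilution calculation, sketched below with a hypothetical helper function.

```python
def final_concentration(stock_conc, added_volume, initial_volume):
    """Concentration of an added reagent after mixing (same units as stock_conc)."""
    return stock_conc * added_volume / (initial_volume + added_volume)


if __name__ == "__main__":
    # 10 uL of 0.5% SDS added to a 20 uL reaction
    print(f"{final_concentration(0.5, 10.0, 20.0):.4f}% SDS")
```

The result, about 0.167%, agrees with the value stated above to within rounding.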

RNA was purified from the reaction mixture using RNA XP clean beads (Beckman Coulter) with 10:3 and 3:1 ratios of magnetic beads and isopropanol to sample volume, respectively. Purified RNA was reverse transcribed using the qScript Flex cDNA kit according to manufacturer specifications with modifications. Specifically, 12.85 μL of purified RNA was combined with 2 μL of GSP enhancer and 0.15 μL of 100 μM RT primer, mixed by vortexing and incubated at 65° C. for 5 minutes before entering a 42° C. hold. At this point 4 μL of qScript flex reaction mastermix (5×) and 1 μL of qScript RT were added to each reaction and mixed by pipetting followed by a one hour incubation at 42° C., then heating at 85° C. for 5 minutes. The cDNA was prepared for sequencing with two rounds of PCR amplification to add Illumina adaptors and barcodes and was sequenced on an Illumina NextSeq. Rates of in vitro RNA editing were determined at all cytidines (for C-to-U activity) and adenosines (for A-to-I activity) within the sequencing window.

Whole-Transcriptome Sequencing to Evaluate ADAR Editing Specificity

For analyzing off-target RNA editing sites across the transcriptome, total RNA from cells was harvested 48 hours post-transfection using the RNeasy Plus Miniprep kit (Qiagen). The mRNA fraction was then enriched using a NEBNext Poly(A) mRNA Magnetic Isolation Module (NEB) and this RNA was then prepared for sequencing using an NEBNext Ultra RNA Library Prep Kit for Illumina (NEB). The libraries were then sequenced on an Illumina NextSeq and loaded such that there were at least 5 million reads per sample.

RNA Editing Analysis for Targeted and Transcriptome-Wide Experiments

Analysis of the transcriptome-wide editing RNA sequencing data was performed on the FireCloud computational framework (software.broadinstitute.org/firecloud/) using a custom workflow developed for this publication: portal.firecloud.org/#methods/m/rna_editing_final_workflow/rna_editing_final_workflow/l.

For analysis, unless otherwise denoted, sequence files were randomly downsampled to 5 million reads. An index was generated using the RefSeq GRCh38 assembly with Gluc and Cluc sequences added, and reads were aligned and quantified using Bowtie/RSEM version 1.3.0. Alignment BAMs were then sorted and analyzed for RNA editing sites using REDitools (35, 36) with the following parameters: -t 8 -e -d -l -U [AG or TC or CT or GA] -p -u -m20 -T6-0 -W -v 1 -n 0.0. Any significant edits found in untransfected or EGFP-transfected conditions were considered to be SNPs or artifacts of the transfection and filtered out from the analysis of off-targets. Off-targets were considered significant if the Fisher's exact test yielded a p-value less than 0.05 after multiple hypothesis correction by the Benjamini-Hochberg procedure and at least 2 of 3 biological replicates identified the edit site. Overlap of edits between samples was calculated relative to the maximum possible overlap, equivalent to the smaller number of edits between the two samples. The percentage of overlapping edit sites was calculated as the number of shared edit sites divided by the minimum number of edits of the two samples, multiplied by 100. An additional layer of filtering for known SNP positions was performed using the Kaviar (37) method for identifying SNPs.
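The significance and overlap criteria described above can be sketched in a few functions. This is an illustration, not the workflow actually used: the function names are hypothetical, the input is assumed to be a per-replicate mapping from edit site to raw Fisher's exact test p-value, and the Benjamini-Hochberg adjustment is assumed to be applied within each replicate (the text does not specify this).

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_sites(replicates, alpha=0.05, min_replicates=2):
    """Sites with adjusted p < alpha in at least min_replicates replicates."""
    hits = {}
    for replicate in replicates:
        sites = list(replicate)
        adjusted = benjamini_hochberg([replicate[s] for s in sites])
        for site, q in zip(sites, adjusted):
            if q < alpha:
                hits[site] = hits.get(site, 0) + 1
    return {site for site, n in hits.items() if n >= min_replicates}


def percent_overlap(sites_a, sites_b):
    """Shared sites divided by the smaller set size, multiplied by 100."""
    smaller = min(len(sites_a), len(sites_b))
    if smaller == 0:
        return 0.0
    return 100.0 * len(set(sites_a) & set(sites_b)) / smaller


if __name__ == "__main__":
    reps = [{"chr1:100": 0.001, "chr2:50": 0.5},
            {"chr1:100": 0.002, "chr2:50": 0.6},
            {"chr1:100": 0.9, "chr2:50": 0.01}]
    print(significant_sites(reps))
    print(percent_overlap({"a", "b"}, {"b", "c", "d"}))
```

Dividing by the smaller set means that a sample whose edits are entirely contained in a larger sample scores 100% overlap.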

Differential Gene Expression Analysis of RNA Editing

A Bowtie index was created based on the human hg38 UCSC genome and RefSeq transcriptome. Next, RSEM v1.3.157 was run with command line options “--estimate-rspd --bowtie-chunkmbs 512 --paired-end” to align paired-end reads directly to this index using Bowtie and estimate expression levels in transcripts per million (TPM) based on the alignments. For analysis of transcriptome changes, transcripts were considered detected if the average TPM of either the RESCUE or GFP control conditions was greater than 1. The Student's t-test was performed to identify differentially expressed isoforms with p-values passing a 0.01 FDR correction.
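The detection criterion can be expressed as a short filter. The sketch below covers only the TPM threshold step, not the t-test or FDR correction; the function name and the example TPM values are hypothetical placeholders.

```python
from statistics import mean


def is_detected(tpm_rescue, tpm_control, threshold=1.0):
    """True if the mean TPM of either condition exceeds the threshold.

    tpm_rescue, tpm_control: per-replicate TPM values for one transcript.
    """
    return mean(tpm_rescue) > threshold or mean(tpm_control) > threshold


if __name__ == "__main__":
    # Placeholder TPM values for two transcripts
    print(is_detected([0.2, 0.4, 0.3], [1.5, 2.0, 0.9]))
    print(is_detected([0.2, 0.4], [0.9, 1.0]))
```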

STAT Phenotype Assay

Cells were transfected with RESCUE plasmids, guide plasmids targeting residues on STAT3 and STAT1, and a luciferase reporter for STAT3 (Qiagen Cignal STAT3 Reporter) and STAT1 signaling (Qiagen Cignal GAS Reporter) using Lipofectamine 2000, as described above, and incubated for 48 hours. After 48 hours, the Dual-Glo Luciferase Assay (Promega) was used to measure firefly and renilla luciferase activity in the cells. The firefly signal was normalized to the renilla signal to measure the relative activation of STAT3 and STAT1.
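The normalization can be sketched as a ratio of the two luciferase readings. The first function follows the description above; the second, which expresses the result relative to a control condition such as a non-targeting guide, is an added assumption, and both names are hypothetical.

```python
def normalized_signal(firefly, renilla):
    """Firefly reporter signal normalized to the renilla control signal."""
    if renilla <= 0:
        raise ValueError("renilla signal must be positive")
    return firefly / renilla


def fold_over_control(sample_ratio, control_ratio):
    """Normalized signal relative to a control condition (assumption)."""
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    return sample_ratio / control_ratio


if __name__ == "__main__":
    # Placeholder plate-reader values
    sample = normalized_signal(2000.0, 500.0)
    control = normalized_signal(1000.0, 500.0)
    print(fold_over_control(sample, control))
```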

β-Catenin Phenotype Assay

Cells were plated 24 hours prior to transfection in cell migration plates containing cores that prevent cell growth in the center of the well. After 24 hours, cells were transfected with RESCUE plasmids, guide plasmids targeting residues on β-catenin, and a luciferase reporter for β-catenin activation (Qiagen TCF/LEF Cignal Reporter) using Lipofectamine 2000, as described above, and incubated. After 24 hours, central cores were removed to allow for cell growth towards the center of the well. After another 24 hours of incubation, media was assayed for Gluc and Cluc luciferase signal. The relative ratio of Gluc to Cluc was calculated to determine the relative β-catenin activation between conditions. On day 3 cells were incubated for 10 minutes with CellTracker Green CMFDA Dye (Thermo Fisher Scientific) and then washed with media. Cells were imaged daily using fluorescence to measure cell growth. Cell growth into the central area of the well was measured using ImageJ software by calculating the total area of fluorescence in the central growth region. Images were processed using an automated macro with the following commands:

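// ImageJ macro for calculating cellular area
// Convert the fluorescence image to 8-bit grayscale
run("8-bit");
// Binarize with Bernsen local thresholding (radius 15), objects in white
run("Auto Local Threshold", "method=Bernsen radius=15 parameter_1=0 parameter_2=0 white");
// Select the thresholded signal and record the measurement
setAutoThreshold("Default dark");
run("Measure");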

β-Catenin Migration Assay (HUVECs)

HUVECs were plated on Collagen I-coated cell migration plates 16 hours prior to transfection. 100 ng of a single vector, containing both the RESCUE construct and guide, was used in the transfection protocol described above. After 24 hours, central cores were removed and media was replaced with Endothelial Basal Media-2 (Lonza) supplemented with hydrocortisone, hFGF-B, FBS, ascorbic acid, GA-1000, and heparin from EGM-2 Supplement Pack (Lonza). On day 3, cells were incubated for 10 minutes with CellTracker Green CMFDA Dye diluted in EBM-2 and then washed with media. Cells were imaged daily using fluorescence. Cell growth was measured using ImageJ software by manually outlining and quantifying the cell-free area in each well.

TABLE 27
RESCUE evolution table

RESCUE candidate round | Mutations | Screening method for generation of candidate mutations | Mammalian editing target
r0 (REPAIR) | ADAR2 + E488Q | Hyperactive variant from Kuttan and Bass | N/A
r1 | r0 + V351G | Rational mutagenesis | CCG site on Gluc transcript (C82R)
r2 | r1 + S486A | Rational mutagenesis | CCG site on Gluc transcript (C82R)
r3 | r2 + T375S | Rational mutagenesis | CCG site on Gluc transcript (C82R)
r4 | r3 + S370C | Y66H EGFP | CCG site on Gluc transcript (C82R)
r5 | r4 + P462A | P196L HIS | CCG site on Gluc transcript (C82R)
r6 | r5 + N597I | P196L HIS | CCG site on Gluc transcript (C82R)
r7 | r6 + L332I | P196L HIS | CCG site on Gluc transcript (C82R)
r8 | r7 + I398V | P196L HIS | CCG site on Gluc transcript (C82R)
r9 | r8 + K350I | P196L HIS | CCG site on Gluc transcript (C82R)
r10 | r9 + M383L | P196L HIS | CCG site on Gluc transcript (C82R)
r11 | r10 + D619G | S22P HIS | CCG site on Gluc transcript (C82R)
r12 | r11 + S582T | S22P HIS | T41I on endogenous β-catenin
r13 | r12 + V440I | S22P HIS | T41I on endogenous β-catenin
r14 | r13 + S495N | P196L HIS | CCT site on Gluc transcript (L77P)
r15 | r14 + K418E | P196L HIS | CCT site on Gluc transcript (L77P)
r16 (RESCUE) | r15 + S661T | S22P HIS | UCG, ACG, GCG, CGC sites on Gluc transcript (C82R), CCT and CCA sites on Gluc transcript (L77P), and T41I on endogenous β-catenin

TABLE 28
Disease information for disease-relevant mutations

Candidate | Gene | Diseases
NM_000071.2(CBS): c.325T > C (p.Cys109Arg) | CBS | Thoracic aortic aneurysm and aortic dissection
NM_000141.4(FGFR2): c.799T > C (p.Ser267Pro) | FGFR2 | Pfeiffer syndrome/Crouzon syndrome/Neoplasm of stomach
NM_000551.3(VHL): c.473T > C (p.Leu158Pro) | VHL | Von Hippel-Lindau syndrome
NM_002474.2(MYH11): c.3791T > C (p.Leu1264Pro) | MYH11 | Aortic aneurysm, familial thoracic 4/Thoracic aortic aneurysm and aortic dissection
NM_000018.3(ACADVL): c.848T > C (p.Val283Ala) | ACADVL | Very long chain acyl-CoA dehydrogenase deficiency
NM_002397.4(MEF2C): c.2T > C (p.Met1Thr) | MEF2C | Mental retardation, stereotypic movements, epilepsy, and/or cerebral malformations
NM_002834.4(PTPN11): c.853T > C (p.Phe285Leu) | PTPN11 | Noonan syndrome
NM_005609.3(PYGM): c.2392T > C (p.Trp798Arg) | PYGM | Glycogen storage disease, type V
NM_001256850.1(TTN): c.90211T > C (p.Cys30071Arg) | TTN | Limb-girdle muscular dystrophy, type 2J/Distal myopathy Markesbery-Griggs type/Hereditary myopathy with early respiratory failure/Myopathy, early-onset, with fatal cardiomyopathy/Familial hypertrophic cardiomyopathy 9
NM_005633.3(SOS1): c.806T > C (p.Met269Thr) | SOS1 | Noonan syndrome 4/Noonan syndrome
NM_015559.2(SETBP1): c.2612T > C (p.Ile871Thr) | SETBP1 | Schinzel-Giedion syndrome
NM_004572.3(PKP2): c.2386T > C (p.Cys796Arg) | PKP2 | Arrhythmogenic right ventricular cardiomyopathy, type 9
NM_000138.4(FBN1): c.4222T > C (p.Cys1408Arg) | FBN1 | Marfan syndrome
NM_000375.2(UROS): c.217T > C (p.Cys73Arg) | UROS | Congenital erythropoietic porphyria
NM_014139.2(SCN11A): c.1187T > C (p.Leu396Pro) | SCN11A | not provided/Neuropathy, hereditary sensory and autonomic, type VII
NM_000152.4(GAA): c.1655T > C (p.Leu552Pro) | GAA | Glycogen storage disease, type II
NM_020630.4(RET): c.1858T > C (p.Cys620Arg) | RET | Multiple endocrine neoplasia, type 2a/Multiple endocrine neoplasia, type 2/MEN2A and FMTC
NM_000016.5(ACADM): c.199T > C (p.Tyr67His) | ACADM | Medium-chain acyl-coenzyme A dehydrogenase deficiency
NM_014874.3(MFN2): c.227T > C (p.Leu76Pro) | MFN2 | Charcot-Marie-Tooth disease, type 2A2A
NM_000341.3(SLC3A1): c.1400T > C (p.Met467Thr) | SLC3A1 | Cystinuria
NM_000431.3(MVK): c.803T > C (p.Ile268Thr) | MVK | Mevalonic aciduria/Hyperimmunoglobulin D with periodic fever
NM_004004.5(GJB2): c.229T > C (p.Trp77Arg) | GJB2 | Deafness, autosomal recessive 1A/Deafness, autosomal dominant 3a/Nonsyndromic hearing loss and deafness
NM_000041.4(APOE): c.388T > C (p.Cys130Arg) | APOE | Alzheimer disease 2
NM_000041.4(APOE): c.595T > C (p.Cys176Arg) | APOE | Alzheimer disease 2

TABLE 29 Guide sequences used for luciferase editing Editing TargetedREPAIR/ Base flip/ Codon percentage Name gene RESCUE Motif positionchange Spacer sequence Notes (first figure) First figure UCG targetingGluc RESCUE UCG C/30/26 C82R gugcCauugaugugggacaggca No 5′ G 67.493 103Bguide gaucaga (SEQ ID NO: 780) GCG targeting Gluc RESCUE GCG U/30/20C82R guugggcgugcucuugauguggg 45.475 103B guide acaggcag (SEQ ID NO: 781)ACG targeting Gluc RESCUE ACG C/30/28 C82R ggccuuugaugugggacaggcag64.464 103B guide aucagaca (SEQ ID NO: 782) CCG targeting Gluc RESCUECCG C/30/26 C82R gugccguugaugugggacaggca No 5′ G 62.947 103B guidegaucaga (SEQ ID NO: 783) CCU targeting Gluc RESCUE CCU C/30/26 L77Pgggaacggcagaucagacagccc 3.800 103B cuggugca (SEQ ID NO: 784)CCA targeting Gluc RESCUE CCA C/30/26 L77P gggauuggcagaucagacagccc 4.509103B cuggugca (SEQ ID NO: 785) Motif guide Gluc RESCUE UCU U/30/26 L82FgugaUauugaugugggacaggca 46.611 103C UCU, flip U gaucaga (SEQ ID NO: 786)Motif guide Gluc RESCUE UCG U/30/26 C82R gugcUauugaugugggacaggca 57.945103C UCG, flip U gaucaga (SEQ ID NO: 787) Motif guide Gluc RESCUE UCCU/30/26 P82S guggUauugaugugggacaggca 57.165 103C UCC, flip Ugaucaga (SEQ ID NO: 788) Motif guide Gluc RESCUE UCA U/30/26 H82YguguUauugaugugggacaggca 49.256 103C UCA, flip U gaucaga (SEQ ID NO: 789)Motif guide Gluc RESCUE ACU U/30/26 L82F gugaUuuugaugugggacaggca 44.241103C ACU, flip U gaucaga (SEQ ID NO: 790) Motif guide Gluc RESCUE ACGU/30/26 C82R gugcUuuugaugugggacaggca 60.722 103C ACG, flip Ugaucaga (SEQ ID NO: 791) Motif guide Gluc RESCUE ACC U/30/26 P82SguggUuuugaugugggacaggca 58.056 103C ACC, flip U gaucaga (SEQ ID NO: 792)Motif guide Gluc RESCUE ACA U/30/26 H82Y guguUuuugaugugggacaggca 40.921103C ACA, flip U gaucaga (SEQ ID NO: 793) Motif guide Gluc RESCUE GCUU/30/26 L82F gugaUcuugaugugggacaggca 4.603 103C GCU, flip Ugaucaga (SEQ ID NO: 794) Motif guide Gluc RESCUE GCG U/30/26 C82RgugcUcuugaugugggacaggca 43.507 103C GCG, flip U gaucaga (SEQ ID NO: 795)Motif guide Gluc RESCUE GCC 
U/30/26 P82S guggUcuugaugugggacaggca 11.006103C GCC, flip U gaucaga (SEQ ID NO: 796) Motif guide Gluc RESCUE GCAU/30/26 H82Y guguUcuugaugugggacaggca 4.239 103C GCA, flip Ugaucaga (SEQ ID NO: 797) Motif guide Gluc RESCUE CCU U/30/26 L82FgugaUguugaugugggacaggca 11.808 103C CCU, flip U gaucaga (SEQ ID NO: 798)Motif guide Gluc RESCUE CCG U/30/26 C82R gugcUguugaugugggacaggca 51.692103C CCG, flip U gaucaga (SEQ ID NO: 799) Motif guide Gluc RESCUE CCCU/30/26 P82S guggUguugaugugggacaggca 28.402 103C CCC, flip Ugaucaga (SEQ ID NO: 800) Motif guide Gluc RESCUE CCA U/30/26 H82YguguUguugaugugggacaggca 7.597 103C CCA, flip U gaucaga (SEQ ID NO: 801)Motif guide Gluc RESCUE UCU C/30/26 L82F gugaCauugaugugggacaggca 49.430103C UCU, flip C gaucaga (SEQ ID NO: 802) Motif guide Gluc RESCUE UCCC/30/26 P82S guggCauugaugugggacaggca 59.973 103C UCC, flip Cgaucaga (SEQ ID NO: 803) Motif guide Gluc RESCUE UCA C/30/26 H82YguguCauugaugugggacaggca 48.343 103C UCA, flip C gaucaga (SEQ ID NO: 804)Motif guide Gluc RESCUE ACU C/30/26 L82F gugaCuuugaugugggacaggca 47.840103C ACU, flip C gaucaga (SEQ ID NO: 805) Motif guide Gluc RESCUE ACGC/30/26 C82R gugcCuuugaugugggacaggca 70.120 103C ACG, flip C(SEQ ID NO: 806) Motif guide Gluc RESCUE ACC C/30/26 P82SguggCuuugaugugggacaggca 58.779 103C ACC, flip C gaucaga (SEQ ID NO: 807)Motif guide Gluc RESCUE ACA C/30/26 H82Y guguCuuugaugugggacaggca 45.594103C ACA, flip C gaucaga (SEQ ID NO: 808) Motif guide Gluc RESCUE GCUC/30/26 L82F gugaCcuugaugugggacaggca 3.652 103C GCU, flip Cgaucaga (SEQ ID NO: 809) Motif guide Gluc RESCUE GCG C/30/26 C82RgugcCcuugaugugggacaggca 37.719 103C GCG, flip C gaucaga (SEQ ID NO: 810)Motif guide Gluc RESCUE GCC C/30/26 P82S guggCcuugaugugggacaggca 34.488103C GCC, flip C gaucaga (SEQ ID NO: 811) Motif guide Gluc RESCUE GCAC/30/26 H82Y guguCcuugaugugggacaggca 2.944 103C GCA, flip Cgaucaga (SEQ ID NO: 812) Motif guide Gluc RESCUE CCU C/30/26 L82FgugaCguugaugugggacaggca 16.647 103C CCU, flip C gaucaga (SEQ ID NO: 813)Motif guide Gluc 
RESCUE CCC C/30/26 P82S guggCguugaugugggacaggca 48.269103C CCC, flip C gaucaga (SEQ ID NO: 814) Motif guide Gluc RESCUE CCAC/30/26 H82Y guguCguugaugugggacaggca 12.670 103C CCA, flip Cgaucaga (SEQ ID NO: 815) Non-targeting N/A N/A N/A N/A N/Aguaaugccuggcuugucgacgca N/A 104C guide uagucug (SEQ ID NO: 816)Gluc specificity Gluc RESCUE UCG C/30/26 C82R ggugcuaGugaugugggacagcAdditional G 10.836 105D guide with off- agaucaga (SEQ ID NO: 817) addedtarget A-G specificity mismatch 1 Gluc specificity Gluc RESCUE UCGC/30/26 C82R ggugcuauGgaugugggacaggc Additional G 16.037 105Dguide with off- agaucaga (SEQ ID NO: 818) added target A-G specificitymismatch 2 Gluc specificity Gluc RESCUE UCG C/30/26 C82RggugcuauugaugGgggacaggc Additional G 29.555 105D guide with off-agaucaga (SEQ ID NO: 819) added target A-G specificity mismatch 3Gluc specificity Gluc RESCUE UCG C/30/26 C82R ggugcuaGGgaugugggacaggcAdditional G 1.533 105D guide with off- agaucaga (SEQ ID NO: 820) addedtarget A-G specificity combo 1 + 2 Gluc specificity Gluc RESCUE UCGC/30/26 C82R ggugcuaGGgaugGgggacaggc Additional G 0.412 105Dguide with off- agaucaga (SEQ ID NO: 821) added target A-G specificitycombo all A to I REPAIR Cluc REPAIR TAG C/50/34 *85Wgcgcccugugcggacuccuuguc N/A 106A guide gccuucguagguguggcagcguccuggg (SEQ ID NO: 822) Tiling guide 30 Gluc RESCUE UCG U/30/30 C82Rguauugaugugggacaggcagau 6.327 115 flip 30 U cagacagc (SEQ ID NO: 823)Tiling guide 30 Gluc RESCUE UCG U/30/28 C82R ggcuauugaugugggacaggcag45.029 115 flip 28 U aucagaca (SEQ ID NO: 824) Tiling guide 30 GlucRESCUE UCG U/30/26 C82R ggugcuauugaugugggacaggc 54.433 115 flip 26 Uagaucaga (SEQ ID NO: 825) Tiling guide 30 Gluc RESCUE UCG U/30/24 C82Rggcgugcuauugaugugggacag 51.454 115 flip 24 U gcagauca (SEQ ID NO: 826)Tiling guide 30 Gluc RESCUE UCG U/30/22 C82R ggggcgugcuauugaugugggac28.375 115 flip 22 U aggcagau (SEQ ID NO: 827) Tiling guide 30 GlucRESCUE UCG U/30/20 C82R guugggcgugcuauugauguggg 34.460 115 flip 20 Uacaggcag (SEQ ID NO: 828) Tiling guide 
30 Gluc RESCUE UCG U/30/18 C82Rgucuugggcgugcuauugaugug 24.148 115 flip 18 U ggacaggc (SEQ ID NO: 829)Tiling guide 30 Gluc RESCUE UCG U/30/16 C82R gcaucuugggcgugcuauugaug12.372 115 flip 16 U ugggacag (SEQ ID NO: 830) Tiling guide 30 GlucRESCUE UCG U/30/14 C82R guucaucuugggcgugcuauuga 2.008 115 flip 14 Uugugggac (SEQ ID NO: 831) Tiling guide 30 Gluc RESCUE UCG U/30/12 C82Rgucuucaucuugggcgugcuauu 4.807 115 flip 12 U gauguggg (SEQ ID NO: 832)Tiling guide 30 Gluc RESCUE UCG U/30/10 C82R gcuucuucaucuugggcgugcua6.679 115 flip 10 U (SEQ ID NO: 833) Tiling guide 30 Gluc RESCUE UCGU/30/8 C82R gaacuucuucaucuugggcgugc 0.930 115 flip 8 Uuauugaug (SEQ ID NO: 834) Tiling guide 30 Gluc RESCUE UCG U/30/6 C82Rgugaacuucuucaucuugggcgu 22.763 115 flip 6 U gcuauuga (SEQ ID NO: 835)Tiling guide 30 Gluc RESCUE UCG U/30/4 C82R ggaugaacuucuucaucuugggc0.633 115 flip 4 U gugcuauu (SEQ ID NO: 836) Tiling guide 30 Gluc RESCUEUCG U/30/2 C82R ggggaugaacuucuucaucuugg 0.266 115 flip 2 Ugcgugcua (SEQ ID NO: 837) Tiling guide 50 Gluc RESCUE UCG U/50/50 C82Rguauugaugugggacaggcagau 1.263 115 flip 50 U cagacagccccuggugcagccagcuuuc (SEQ ID NO: 838) Tiling guide 50 Gluc RESCUE UCG U/50/48 C82Rggcuauugaugugggacaggcag 24.879 115 flip 48 U aucagacagccccuggugcagccagcuu (SEQ ID NO: 839) Tiling guide 50 Gluc RESCUE UCG U/50/46 C82Rggugcuauugaugugggacaggc 21.993 115 flip 46 U agaucagacagccccuggugcagccagc (SEQ ID NO: 840) Tiling guide 50 Gluc RESCUE UCG U/50/44 C82Rggcgugcuauugaugugggacag 25.736 115 flip 44 U gcagaucagacagccccuggugcagcca (SEQ ID NO: 841) Tiling guide 50 Gluc RESCUE UCG U/50/42 C82Rggggcgugcuauugaugugggac 27.579 115 flip 42 U aggcagaucagacagccccuggugcagc (SEQ ID NO: 842) Tiling guide 50 Gluc RESCUE UCG U/50/40 C82Rguugggcgugcuauugauguggg 27.775 115 flip 40 U acaggcagaucagacagccccuggugca (SEQ ID NO: 843) Tiling guide 50 Gluc RESCUE UCG U/50/38 C82Rgucuugggcgugcuauugaugug 13.260 115 flip 38 U ggacaggcagaucagacagccccuggug (SEQ ID NO: 844) Tiling guide 50 Gluc RESCUE UCG U/50/36 
C82Rgcaucuugggcgugcuauugaug 9.892 115 flip 36 U ugggacaggcagaucagacagccccugg (SEQ ID NO: 845) Tiling guide 50 Gluc RESCUE UCG U/50/34 C82Rguucaucuugggcgugcuauuga 19.186 115 flip 34 U ugugggacaggcagaucagacagccccu (SEQ ID NO: 846) Tiling guide 50 Gluc RESCUE UCG U/50/32 C82Rgucuucaucuugggcgugcuauu 22.932 115 flip 32 U gaugugggacaggcagaucagacagccc (SEQ ID NO: 847) Tiling guide 50 Gluc RESCUE UCG U/50/30 C82Rgcuucuucaucuugggcgugcua 12.794 115 flip 30 U uugaugugggacaggcagaucagacagc (SEQ ID NO: 848) Tiling guide 50 Gluc RESCUE UCG U/50/28 C82Rgaacuucuucaucuugggcgugc 33.367 115 flip 28 U uauugaugugggacaggcagaucagaca (SEQ ID NO: 849) Tiling guide 50 Gluc RESCUE UCG U/50/26 C82Rgugaacuucuucaucuugggcgu 32.651 115 flip 26 U gcuauugaugugggacaggcagaucaga (SEQ ID NO: 850) Tiling guide 50 Gluc RESCUE UCG U/50/24 C82Rggaugaacuucuucaucuugggc 22.201 115 flip 24 U gugcuauugaugugggacaggcagauca (SEQ ID NO: 851) Tiling guide 50 Gluc RESCUE UCG U/50/22 C82Rggggaugaacuucuucaucuugg 12.607 115 flip 22 U gcgugcuauugaugugggacaggcagau (SEQ ID NO: 852) Tiling guide 50 Gluc RESCUE UCG U/50/20 C82Rgcugggaugaacuucuucaucuu 17.727 115 flip 20 U gggcgugcuauugaugugggacaggcag (SEQ ID NO: 853) Tiling guide 50 Gluc RESCUE UCG U/50/18 C82Rguccugggaugaacuucuucauc 11.842 115 flip 18 U uugggcgugcuauugaugugggacaggc (SEQ ID NO: 854) Tiling guide 50 Gluc RESCUE UCG U/50/16 C82Rgcguccugggaugaacuucuuca 9.368 115 flip 16 U ucuugggcgugcuauugaugugggacag (SEQ ID NO: 855) Tiling guide 50 Gluc RESCUE UCG U/50/14 C82Rgagcguccugggaugaacuucuu 2.637 115 flip 14 U caucuugggcgugcuauugaugugggac (SEQ ID NO: 856) Tiling guide 50 Gluc RESCUE UCG U/50/12 C82Rggcagcguccugggaugaacuuc 35.033 115 flip 12 U uucaucuugggcgugcuauugauguggg (SEQ ID NO: 857) Tiling guide 50 Gluc RESCUE UCG U/50/10 C82Rguggcagcguccugggaugaacu 10.675 115 flip 10 U ucuucaucuugggcgugcuauugaugug (SEQ ID NO: 858) Tiling guide 50 Gluc RESCUE UCG U/50/8 C82Rguguggcagcguccugggaugaa 1.730 115 flip 8 U cuucuucaucuugggcgugcuauugaug (SEQ ID NO: 859) Tiling guide 50 Gluc 
RESCUE UCG U/50/6 C82Rggguguggcagcguccugggaug 2.249 115 flip 6 U aacuucuucaucuugggcgugcuauuga (SEQ ID NO: 860)
Tiling guide 50 flip 4 U | Gluc | RESCUE | UCG | U/50/4 | C82R | guagguguggcagcguccugggaugaacuucuucaucuugggcgugcuauu (SEQ ID NO: 861) | 0.438 | 115
Tiling guide 50 flip 2 U | Gluc | RESCUE | UCG | U/50/2 | C82R | gcguagguguggcagcguccugggaugaacuucuucaucuugggcgugcua (SEQ ID NO: 862) | 0.293 | 115
Motif guide UCU, flip A | Gluc | RESCUE | UCU | A/30/26 | L82F | gugaAauugaugugggacaggcagaucaga (SEQ ID NO: 863) | 0.084 | 114A-114C
Motif guide UCG, flip A | Gluc | RESCUE | UCG | A/30/26 | C82R | gugcAauugaugugggacaggcagaucaga (SEQ ID NO: 864) | 0.399 | 114A-114C
Motif guide UCC, flip A | Gluc | RESCUE | UCC | A/30/26 | P82S | guggAauugaugugggacaggcagaucaga (SEQ ID NO: 865) | 0.210 | 114A-114C
Motif guide UCA, flip A | Gluc | RESCUE | UCA | A/30/26 | H82Y | guguAauugaugugggacaggcagaucaga (SEQ ID NO: 866) | 0.077 | 114A-114C
Motif guide ACU, flip A | Gluc | RESCUE | ACU | A/30/26 | L82F | gugaAuuugaugugggacaggcagaucaga (SEQ ID NO: 867) | 0.097 | 114A-114C
Motif guide ACG, flip A | Gluc | RESCUE | ACG | A/30/26 | C82R | gugcAuuugaugugggacaggcagaucaga (SEQ ID NO: 868) | 0.399 | 114A-114C
Motif guide ACC, flip A | Gluc | RESCUE | ACC | A/30/26 | P82S | guggAuuugaugugggacaggcagaucaga (SEQ ID NO: 869) | 0.138 | 114A-114C
Motif guide ACA, flip A | Gluc | RESCUE | ACA | A/30/26 | H82Y | guguAuuugaugugggacaggcagaucaga (SEQ ID NO: 870) | 0.036 | 114A-114C
Motif guide GCU, flip A | Gluc | RESCUE | GCU | A/30/26 | L82F | gugaAcuugaugugggacaggcagaucaga (SEQ ID NO: 871) | 0.057 | 114A-114C
Motif guide GCG, flip A | Gluc | RESCUE | GCG | A/30/26 | C82R | gugcAcuugaugugggacaggcagaucaga (SEQ ID NO: 872) | 0.029 | 114A-114C
Motif guide GCC, flip A | Gluc | RESCUE | GCC | A/30/26 | P82S | guggAcuugaugugggacaggcagaucaga (SEQ ID NO: 873) | 0.023 | 114A-114C
Motif guide GCA, flip A | Gluc | RESCUE | GCA | A/30/26 | H82Y | guguAcuugaugugggacaggcagaucaga (SEQ ID NO: 874) | 0.022 | 114A-114C
Motif guide CCU, flip A | Gluc | RESCUE | CCU | A/30/26 | L82F | gugaAguugaugugggacaggcagaucaga (SEQ ID NO: 875) | 0.055 | 114A-114C
Motif guide CCG, flip A | Gluc | RESCUE | CCG | A/30/26 | C82R | gugcAguugaugugggacaggcagaucaga (SEQ ID NO: 876) | 0.066 | 114A-114C
Motif guide CCC, flip A | Gluc | RESCUE | CCC | A/30/26 | P82S | guggAguugaugugggacaggcagaucaga (SEQ ID NO: 877) | 0.016 | 114A-114C
Motif guide CCA, flip A | Gluc | RESCUE | CCA | A/30/26 | H82Y | guguAguugaugugggacaggcagaucaga (SEQ ID NO: 878) | 0.022 | 114A-114C
Motif guide UCU, flip G | Gluc | RESCUE | UCU | G/30/26 | L82F | gugaGauugaugugggacaggcagaucaga (SEQ ID NO: 879) | 0.058 | 114A-114C
Motif guide UCG, flip G | Gluc | RESCUE | UCG | G/30/26 | C82R | gugcGauugaugugggacaggcagaucaga (SEQ ID NO: 880) | 0.094 | 114A-114C
Motif guide UCC, flip G | Gluc | RESCUE | UCC | G/30/26 | P82S | guggGauugaugugggacaggcagaucaga (SEQ ID NO: 881) | 0.022 | 114A-114C
Motif guide UCA, flip G | Gluc | RESCUE | UCA | G/30/26 | H82Y | guguGauugaugugggacaggcagaucaga (SEQ ID NO: 882) | 0.026 | 114A-114C
Motif guide ACU, flip G | Gluc | RESCUE | ACU | G/30/26 | L82F | gugaGuuugaugugggacaggcagaucaga (SEQ ID NO: 883) | 0.053 | 114A-114C
Motif guide ACG, flip G | Gluc | RESCUE | ACG | G/30/26 | C82R | gugcGuuugaugugggacaggcagaucaga (SEQ ID NO: 884) | 0.035 | 114A-114C
Motif guide ACC, flip G | Gluc | RESCUE | ACC | G/30/26 | P82S | guggGuuugaugugggacaggcagaucaga (SEQ ID NO: 885) | 0.017 | 114A-114C
Motif guide ACA, flip G | Gluc | RESCUE | ACA | G/30/26 | H82Y | guguGuuugaugugggacaggcagaucaga (SEQ ID NO: 886) | 0.030 | 114A-114C
Motif guide GCU, flip G | Gluc | RESCUE | GCU | G/30/26 | L82F | gugaGcuugaugugggacaggcagaucaga (SEQ ID NO: 887) | 0.053 | 114A-114C
Motif guide GCG, flip G | Gluc | RESCUE | GCG | G/30/26 | C82R | gugcGcuugaugugggacaggcagaucaga (SEQ ID NO: 888) | 0.026 | 114A-114C
Motif guide GCC, flip G | Gluc | RESCUE | GCC | G/30/26 | P82S | guggGcuugaugugggacaggcagaucaga (SEQ ID NO: 889) | 0.018 | 114A-114C
Motif guide GCA, flip G | Gluc | RESCUE | GCA | G/30/26 | H82Y | guguGcuugaugugggacaggcagaucaga (SEQ ID NO: 890) | 0.018 | 114A-114C
Motif guide CCU, flip G | Gluc | RESCUE | CCU | G/30/26 | L82F | gugaGguugaugugggacaggcagaucaga (SEQ ID NO: 891) | 0.049 | 114A-114C
Motif guide CCG, flip G | Gluc | RESCUE | CCG | G/30/26 | C82R | gugcGguugaugugggacaggcagaucaga (SEQ ID NO: 892) | 0.064 | 114A-114C
Motif guide CCC, flip G | Gluc | RESCUE | CCC | G/30/26 | P82S | guggGguugaugugggacaggcagaucaga (SEQ ID NO: 893) | 0.011 | 114A-114C
Motif guide CCA, flip G | Gluc | RESCUE | CCA | G/30/26 | H82Y | guguGguugaugugggacaggcagaucaga (SEQ ID NO: 894) | 0.017 | 114A-114C
Tiling guide 30 flip 30 C | Gluc | RESCUE | UCG | C/30/30 | C82R | guguugaugugggacaggcagaucagacagc (SEQ ID NO: 895) | 0.164 | 122A-122C
Tiling guide 30 flip 28 C | Gluc | RESCUE | UCG | C/30/28 | C82R | ggcuguugaugugggacaggcagaucagaca (SEQ ID NO: 896) | 37.368 | 122A-122C
Tiling guide 30 flip 26 C | Gluc | RESCUE | UCG | C/30/26 | C82R | ggugcuguugaugugggacaggcagaucaga (SEQ ID NO: 897) | 44.775 | 122A-122C
Tiling guide 30 flip 24 C | Gluc | RESCUE | UCG | C/30/24 | C82R | ggcgugcuguugaugugggacaggcagauca (SEQ ID NO: 898) | 26.988 | 122A-122C
Tiling guide 30 flip 22 C | Gluc | RESCUE | UCG | C/30/22 | C82R | ggggcgugcuguugaugugggacaggcagau (SEQ ID NO: 899) | 16.710 | 122A-122C
Tiling guide 30 flip 20 C | Gluc | RESCUE | UCG | C/30/20 | C82R | guugggcgugcuguugaugugggacaggcag (SEQ ID NO: 900) | 29.288 | 122A-122C
Tiling guide 30 flip 18 C | Gluc | RESCUE | UCG | C/30/18 | C82R | gucuugggcgugcuguugaugugggacaggc (SEQ ID NO: 901) | 18.125 | 122A-122C
Tiling guide 30 flip 16 C | Gluc | RESCUE | UCG | C/30/16 | C82R | gcaucuugggcgugcuguugaugugggacag (SEQ ID NO: 902) | 1.532 | 122A-122C
Tiling guide 30 flip 14 C | Gluc | RESCUE | UCG | C/30/14 | C82R | guucaucuugggcgugcuguugaugugggac (SEQ ID NO: 903) | 0.184 | 122A-122C
Tiling guide 30 flip 12 C | Gluc | RESCUE | UCG | C/30/12 | C82R | gucuucaucuugggcgugcuguugauguggg (SEQ ID NO: 904) | 0.341 | 122A-122C
Tiling guide 30 flip 10 C | Gluc | RESCUE | UCG | C/30/10 | C82R | gcuucuucaucuugggcgugcuguugaugug (SEQ ID NO: 905) | 0.275 | 122A-122C
Tiling guide 30 flip 8 C | Gluc | RESCUE | UCG | C/30/8 | C82R | gaacuucuucaucuugggcgugcuguugaug (SEQ ID NO: 906) | 0.054 | 122A-122C
Tiling guide 30 flip 6 C | Gluc | RESCUE | UCG | C/30/6 | C82R | gugaacuucuucaucuugggcgugcuguuga (SEQ ID NO: 907) | 0.437 | 122A-122C
Tiling guide 30 flip 4 C | Gluc | RESCUE | UCG | C/30/4 | C82R | ggaugaacuucuucaucuugggcgugcuguu (SEQ ID NO: 908) | 0.226 | 122A-122C
Tiling guide 30 flip 2 C | Gluc | RESCUE | UCG | C/30/2 | C82R | ggggaugaacuucuucaucuugggcgugcug (SEQ ID NO: 909) | 0.040 | 122A-122C
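The "Tiling guide 30" rows above step the flip position down by two (30, 28, 26, and so on). As an illustrative check only, the Python sketch below compares two adjacent tiling spacers, SEQ ID NO: 896 and SEQ ID NO: 897, each written as one string by joining the two wrapped halves listed in the table. The function name `tiling_shift` is hypothetical, and dropping the leading 5' g before comparison is an assumption made for this sketch rather than something the table states.

```python
def tiling_shift(earlier: str, later: str, max_shift: int = 10) -> int:
    """Return the smallest k >= 1 such that `later`, with its first k bases
    removed, equals `earlier` with its last k bases removed."""
    if len(earlier) != len(later):
        raise ValueError("spacers must have the same length")
    for k in range(1, max_shift + 1):
        if later[k:] == earlier[:-k]:
            return k
    raise ValueError("no overlap found within max_shift")


# Spacers as listed for the Gluc tiling guides (two table halves joined).
FLIP_28 = "ggcuguugaugugggacaggcagaucagaca"  # SEQ ID NO: 896
FLIP_26 = "ggugcuguugaugugggacaggcagaucaga"  # SEQ ID NO: 897

# Assumption: the first g is a fixed 5' base, so it is dropped before comparing.
shift = tiling_shift(FLIP_28[1:], FLIP_26[1:])
print(shift)  # prints 2
```

For this pair the two 30-nt windows overlap with an offset of two bases, which matches the two-unit step between "flip 28" and "flip 26" in the guide names.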

TABLE 30 Guide sequences used for endogenous gene editing
Name | Targeted gene | REPAIR/RESCUE | Motif | Base flip/position | Codon change | Spacer sequence | Editing percentage (first figure) | First figure
S33F_CTNNB1_30bp_guide_30_9 C flip | CTNNB1 | RESCUE | UCU | C/22 | S33F | GGGAUUCCACAGUCCAGGUAAGACUGUUGCU (SEQ ID NO: 910) | 11.245 | 103E
H36Y_CTNNB1_30bp_guide_30_9 U flip | CTNNB1 | RESCUE | CCA | U/22 | H36Y | GACCAGAAUUGAUUCCAGAGUCCAGGUAAGA (SEQ ID NO: 911) | 2.995 | 103E
S37F_CTNNB1_30bp_guide_30_9 U flip | CTNNB1 | RESCUE | UCU | U/22 | S37F | GUGGCACCAUAAUGGAUUCCAGAGUCCAGGU (SEQ ID NO: 912) | 17.616 | 103E
T41I_CTNNB1_30bp_guide_30_11 U flip | CTNNB1 | RESCUE | ACC | U/20 | T41I | GAGGAGCUGUGUUAGUGGCACCAGAAUGGAU (SEQ ID NO: 913) | 15.711 | 103E
P44L_CTNNB1_30bp_guide_30_9 C flip | CTNNB1 | RESCUE | CCU | C/22 | P44L | GUCAGAGAACGAGCUGUGGUAGUGGCACCAG (SEQ ID NO: 914) | 8.599 | 103E
P44S_CTNNB1_30bp_guide_30_11 U flip | CTNNB1 | RESCUE | UCU | U/20 | P44S | GCUCAGAGAAGUAGCUGUGGUAGUGGCACCA (SEQ ID NO: 915) | 22.839 | 103E
S45F_CTNNB1_30bp_guide_30_11 C flip | CTNNB1 | RESCUE | UCU | C/20 | S45F | GACCACUCAGACAAGGAGCUGUGGUAGUGGC (SEQ ID NO: 916) | 12.412 | 103E
TCG_KRAS_30bp_guide_30_7 U flip | KRAS | RESCUE | UCG | U/24 | L56L | GUGUGUCUAGAAUAUCCAAGAGACAGGUUUC (SEQ ID NO: 917) | 18.405 | 103E
ACG_KRAS_30bp_guide_30_11 C flip | KRAS | RESCUE | ACG | C/20 | D30D | GGAUCAUAUUCCUCCACAAAAUGAUUCUGAA (SEQ ID NO: 918) | 34.013 | 103E
GCG_KRAS_30bp_guide_30_11 U flip | KRAS | RESCUE | GCG | U/20 | G13G | GUCUUGCCUACUCCACCAGCUCCAACUACCA (SEQ ID NO: 919) | 2.180 | 103E
CCT_KRAS_30bp_guide_30_11 C flip | KRAS | RESCUE | CCU | C/20 | A18A | GGUAUCGUCAACGCACUCUUGCCUACGCCAC (SEQ ID NO: 920) | 9.465 | 103E
TCG_PPIB_30bp_guide_30_11 U flip | PPIB | RESCUE | UCG | U/20 | I18I | GCGGACCCCGCUAUGAGGGCGGCGGCAAGGA (SEQ ID NO: 921) | 10.340 | 103E
ACG_PPIB_30bp_guide_30_7 C flip | PPIB | RESCUE | ACG | C/24 | R7C | GAUAUUCCUCCACAAAAUGAUUCUGAAUUAG (SEQ ID NO: 922) | 34.213 | 103E
GCG_PPIB_30bp_guide_30_11 U flip | PPIB | RESCUE | GCG | U/20 | A19V | GGACGGACCCCUCGAUGAGGGCGGCGGCAAG (SEQ ID NO: 923) | 4.778 | 103E
CCG_PPIB_30bp_guide_30_11 C flip | PPIB | RESCUE | CCG | C/20 | S21S | GGGAAGAAGACCGACCCCGCGAUGAGGGCGG (SEQ ID NO: 924) | 6.101 | 103E
TCG_SMARCA4_30bp_guide_30_9 U flip | SMARCA4 | RESCUE | UCG | U/22 | S85L | GGGUCGUCCUACAUGCCCUUCUCAUGCAUGG (SEQ ID NO: 925) | 6.807 | 103E
ACG_SMARCA4_30bp_guide_30_11 U flip | SMARCA4 | RESCUE | ACG | U/20 | D86D | GAGCGCGGGUCUUCCGACAUGCCCUUCUCAU (SEQ ID NO: 926) | 6.943 | 103E
GCG_SMARCA4_30bp_guide_30_11 C flip | SMARCA4 | RESCUE | GCG | C/20 | R89C | GUGGUUGUAGCCCGGGUCGUCCGACAUGCCC (SEQ ID NO: 927) | 2.277 | 103E
CCG_SMARCA4_30bp_guide_30_11 U flip | SMARCA4 | RESCUE | CCG | U/20 | P88L | GGUUGUAGCGCUGGUCGUCCGACAUGCCCUU (SEQ ID NO: 928) | 4.819 | 103E
NRAS_C-flip_guide_30_11 | NRAS | RESCUE | UCC | C/20 | I21I | GGGAUUAGCUGCAUUGUCAGUGCGCUUUUCC (SEQ ID NO: 929) | 21.529 | 103E
NKFB1_U-flip_guide_30_11 | NFKB | RESCUE | ACC | U/20 | P33S | GGCCAUCUGUGUUUGAAAUACUUCUGGAUUA (SEQ ID NO: 930) | 24.376 | 103E
EZH2_U-flip_guide_30_11 | EZH2 | RESCUE | UCA | U/20 | F32F | GCAGCUCGUCUUAACCUCUUGAGCUGUCUCA (SEQ ID NO: 931) | 15.855 | 103E
NF2_U-flip_guide_30_7 | NF2 | RESCUE | ACG | U/24 | T21M | GGUGAACUUCUUGGGUUGCUUCCUCUUGAGA (SEQ ID NO: 932) | 24.904 | 103E
RAF1_U-flip_guide_30_7 | RAF1 | RESCUE | UCC | U/24 | P30S | GUUGUAGUAGAGAUGCAGCUGGAGCCAUCAA (SEQ ID NO: 933) | 20.867 | 103E
S33F_CTNNB1_30bp_guide_U-flip_30_7 | CTNNB1 | RESCUE | UCU | U/24 | S33F | GAUUCCAUAGUCCAGGUAAGACUGUUGCUGC (SEQ ID NO: 934) | 9.227 | 125A
S33F_CTNNB1_30bp_guide_U-flip_30_9 | CTNNB1 | RESCUE | UCU | U/22 | S33F | GGGAUUCCAUAGUCCAGGUAAGACUGUUGCU (SEQ ID NO: 935) | 11.245 | 125A
S33F_CTNNB1_30bp_guide_U-flip_30_11 | CTNNB1 | RESCUE | UCU | U/20 | S33F | GAUGGAUUCCAUAGUCCAGGUAAGACUGUUG (SEQ ID NO: 936) | 7.081 | 125A
S33F_CTNNB1_30bp_guide_U-flip_30_13 | CTNNB1 | RESCUE | UCU | U/18 | S33F | GGAAUGGAUUCCAUAGUCCAGGUAAGACUGU (SEQ ID NO: 937) | 9.782 | 125A
H36Y_CTNNB1_30bp_guide_U-flip_30_7 | CTNNB1 | RESCUE | CCA | U/24 | H36Y | GCAGAAUUGAUUCCAGAGUCCAGGUAAGACU (SEQ ID NO: 938) | 1.310 | 125A
H36Y_CTNNB1_30bp_guide_U-flip_30_9 | CTNNB1 | RESCUE | CCA | U/22 | H36Y | GACCAGAAUUGAUUCCAGAGUCCAGGUAAGA (SEQ ID NO: 939) | 2.995 | 125A
H36Y_CTNNB1_30bp_guide_U-flip_30_11 | CTNNB1 | RESCUE | CCA | U/20 | H36Y | GGCACCAGAAUUGAUUCCAGAGUCCAGGUAA (SEQ ID NO: 940) | 0.918 | 125A
H36Y_CTNNB1_30bp_guide_U-flip_30_13 | CTNNB1 | RESCUE | CCA | U/18 | H36Y | GUGGCACCAGAAUUGAUUCCAGAGUCCAGGU (SEQ ID NO: 941) | 1.061 | 125A
S37F_CTNNB1_30bp_guide_U-flip_30_7 | CTNNB1 | RESCUE | UCU | U/24 | S37F | GGCACCAUAAUGGAUUCCAGAGUCCAGGUAA (SEQ ID NO: 942) | 17.616 | 125A
S37F_CTNNB1_30bp_guide_U-flip_30_9 | CTNNB1 | RESCUE | UCU | U/22 | S37F | GUGGCACCAUAAUGGAUUCCAGAGUCCAGGU (SEQ ID NO: 943) | 10.901 | 125A
S37F_CTNNB1_30bp_guide_U-flip_30_11 | CTNNB1 | RESCUE | UCU | U/20 | S37F | GAGUGGCACCAUAAUGGAUUCCAGAGUCCAG (SEQ ID NO: 944) | 8.898 | 125A
S37F_CTNNB1_30bp_guide_U-flip_30_13 | CTNNB1 | RESCUE | UCU | U/18 | S37F | GGUAGUGGCACCAUAAUGGAUUCCAGAGUCC (SEQ ID NO: 945) | 11.718 | 125A
T41I_CTNNB1_30bp_guide_U-flip_30_7 | CTNNB1 | RESCUE | ACC | U/24 | T41I | GGCUGUGUUAGUGGCACCAGAAUGGAUUCCA (SEQ ID NO: 946) | 4.936 | 125A
T41I_CTNNB1_30bp_guide_U-flip_30_9 | CTNNB1 | RESCUE | ACC | U/22 | T41I | GGAGCUGUGUUAGUGGCACCAGAAUGGAUUC (SEQ ID NO: 947) | 14.554 | 125A
T41I_CTNNB1_30bp_guide_U-flip_30_11 | CTNNB1 | RESCUE | ACC | U/20 | T41I | GAGGAGCUGUGUUAGUGGCACCAGAAUGGAU (SEQ ID NO: 948) | 14.890 | 125A
T41I_CTNNB1_30bp_guide_U-flip_30_13 | CTNNB1 | RESCUE | ACC | U/18 | T41I | GGAAGGAGCUGUGUUAGUGGCACCAGAAUGG (SEQ ID NO: 949) | 15.711 | 125A
P44L_CTNNB1_30bp_guide_U-flip_30_7 | CTNNB1 | RESCUE | CCU | U/24 | P44L | GAGAGAAUGAGCUGUGGUAGUGGCACCAGAA (SEQ ID NO: 950) | 3.767 | 125A
P44L_CTNNB1_30bp_guide_U-flip_30_9 | CTNNB1 | RESCUE | CCU | U/22 | P44L | GUCAGAGAAUGAGCUGUGGUAGUGGCACCAG (SEQ ID NO: 951) | 6.569 | 125A
P44L_CTNNB1_30bp_guide_U-flip_30_11 | CTNNB1 | RESCUE | CCU | U/20 | P44L | GACUCAGAGAAUGAGCUGUGGUAGUGGCACC (SEQ ID NO: 952) | 8.599 | 125A
P44L_CTNNB1_30bp_guide_U-flip_30_13 | CTNNB1 | RESCUE | CCU | U/18 | P44L | GCCACUCAGAGAAUGAGCUGUGGUAGUGGCA (SEQ ID NO: 953) | 2.435 | 125A
P44S_CTNNB1_30bp_guide_U-flip_30_7 | CTNNB1 | RESCUE | UCU | U/24 | P44S | GGAGAAGUAGCUGUGGUAGUGGCACCAGAAU (SEQ ID NO: 954) | 16.371 | 125A
P44S_CTNNB1_30bp_guide_U-flip_30_9 | CTNNB1 | RESCUE | UCU | U/22 | P44S | GCAGAGAAGUAGCUGUGGUAGUGGCACCAGA (SEQ ID NO: 955) | 22.090 | 125A
P44S_CTNNB1_30bp_guide_U-flip_30_11 | CTNNB1 | RESCUE | UCU | U/20 | P44S | GCUCAGAGAAGUAGCUGUGGUAGUGGCACCA (SEQ ID NO: 956) | 22.839 | 125A
P44S_CTNNB1_30bp_guide_U-flip_30_13 | CTNNB1 | RESCUE | UCU | U/18 | P44S | GCACUCAGAGAAGUAGCUGUGGUAGUGGCAC (SEQ ID NO: 957) | 15.900 | 125A
S45F_CTNNB1_30bp_guide_U-flip_30_7 | CTNNB1 | RESCUE | UCU | U/24 | S45F | GCUCAGAUAAGGAGCUGUGGUAGUGGCACCA (SEQ ID NO: 958) | 7.049 | 125A
S45F_CTNNB1_30bp_guide_U-flip_30_9 | CTNNB1 | RESCUE | UCU | U/22 | S45F | GCACUCAGAUAAGGAGCUGUGGUAGUGGCAC (SEQ ID NO: 959) | 9.828 | 125A
S45F_CTNNB1_30bp_guide_U-flip_30_11 | CTNNB1 | RESCUE | UCU | U/20 | S45F | GACCACUCAGAUAAGGAGCUGUGGUAGUGGC (SEQ ID NO: 960) | 12.412 | 125A
S45F_CTNNB1_30bp_guide_U-flip_30_13 | CTNNB1 | RESCUE | UCU | U/18 | S45F | GUUACCACUCAGAUAAGGAGCUGUGGUAGUG (SEQ ID NO: 961) | 9.093 | 125A
TCG_KRAS_30bp_guide_U-flip_30_7 | KRAS | RESCUE | UCG | U/24 | L56L | GUGUGUCUAGAAUAUCCAAGAGACAGGUUUC (SEQ ID NO: 962) | 18.707 | 125A
TCG_KRAS_30bp_guide_U-flip_30_9 | KRAS | RESCUE | UCG | U/22 | L56L | GGCUGUGUCUAGAAUAUCCAAGAGACAGGUU (SEQ ID NO: 963) | 18.405 | 125A
TCG_KRAS_30bp_guide_U-flip_30_11 | KRAS | RESCUE | UCG | U/20 | L56L | GCUGCUGUGUCUAGAAUAUCCAAGAGACAGG (SEQ ID NO: 964) | 15.533 | 125A
TCG_KRAS_30bp_guide_U-flip_30_13 | KRAS | RESCUE | UCG | U/18 | L56L | GACCUGCUGUGUCUAGAAUAUCCAAGAGACA (SEQ ID NO: 965) | 15.119 | 125A
ACG_KRAS_30bp_guide_U-flip_30_7 | KRAS | RESCUE | ACG | U/24 | D30D | GAUAUUCUUCCACAAAAUGAUUCUGAAUUAG (SEQ ID NO: 966) | 21.288 | 125A
ACG_KRAS_30bp_guide_U-flip_30_9 | KRAS | RESCUE | ACG | U/22 | D30D | GUCAUAUUCUUCCACAAAAUGAUUCUGAAUU (SEQ ID NO: 967) | 24.011 | 125A
ACG_KRAS_30bp_guide_U-flip_30_11 | KRAS | RESCUE | ACG | U/20 | D30D | GGAUCAUAUUCUUCCACAAAAUGAUUCUGAA (SEQ ID NO: 968) | 34.013 | 125A
ACG_KRAS_30bp_guide_U-flip_30_13 | KRAS | RESCUE | ACG | U/18 | D30D | GUGGAUCAUAUUCUUCCACAAAAUGAUUCUG (SEQ ID NO: 969) | 22.047 | 125A
GCG_KRAS_30bp_guide_U-flip_30_7 | KRAS | RESCUE | GCG | U/24 | G13G | GGCCUACUCCACCAGCUCCAACUACCACAAG (SEQ ID NO: 970) | 0.476 | 125A
GCG_KRAS_30bp_guide_U-flip_30_9 | KRAS | RESCUE | GCG | U/22 | G13G | GUUGCCUACUCCACCAGCUCCAACUACCACA (SEQ ID NO: 971) | 1.735 | 125A
GCG_KRAS_30bp_guide_U-flip_30_11 | KRAS | RESCUE | GCG | U/20 | G13G | GUCUUGCCUACUCCACCAGCUCCAACUACCA (SEQ ID NO: 972) | 2.180 | 125A
GCG_KRAS_30bp_guide_U-flip_30_13 | KRAS | RESCUE | GCG | U/18 | G13G | GACUCUUGCCUACUCCACCAGCUCCAACUA (SEQ ID NO: 973) | 0.559 | 125A
CCT_KRAS_30bp_guide_U-flip_30_7 | KRAS | RESCUE | CCU | U/24 | A18A | GCGUCAAUGCACUCUUGCCUACGCCACCAGC (SEQ ID NO: 974) | 1.694 | 125A
CCT_KRAS_30bp_guide_U-flip_30_9 | KRAS | RESCUE | CCU | U/22 | A18A | GAUCGUCAAUGCACUCUUGCCUACGCCACCA (SEQ ID NO: 975) | 6.043 | 125A
CCT_KRAS_30bp_guide_U-flip_30_11 | KRAS | RESCUE | CCU | U/20 | A18A | GGUAUCGUCAAUGCACUCUUGCCUACGCCAC (SEQ ID NO: 976) | 9.465 | 125A
CCT_KRAS_30bp_guide_U-flip_30_13 | KRAS | RESCUE | CCU | U/18 | A18A | GCUGUAUCGUCAAUGCACUCUUGCCUACGCC (SEQ ID NO: 977) | 3.147 | 125A
TCG_PPIB_30bp_guide_U-flip_30_7 | PPIB | RESCUE | UCG | U/24 | I18I | GCCCCGCUAUGAGGGCGGCGGCAAGGAGCAC (SEQ ID NO: 978) | 5.536 | 125A
TCG_PPIB_30bp_guide_U-flip_30_9 | PPIB | RESCUE | UCG | U/22 | I18I | GGACCCCGCUAUGAGGGCGGCGGCAAGGAGC (SEQ ID NO: 979) | 8.914 | 125A
TCG_PPIB_30bp_guide_U-flip_30_11 | PPIB | RESCUE | UCG | U/20 | I18I | GCGGACCCCGCUAUGAGGGCGGCGGCAAGGA (SEQ ID NO: 980) | 10.340 | 125A
TCG_PPIB_30bp_guide_U-flip_30_13 | PPIB | RESCUE | UCG | U/18 | I18I | GGACGGACCCCGCUAUGAGGGCGGCGGCAAG (SEQ ID NO: 981) | 8.663 | 125A
ACG_PPIB_30bp_guide_U-flip_30_7 | PPIB | RESCUE | ACG | U/24 | R7C | GUGUUGCUUUCGGAGAGGCGCAGCAUCCACA (SEQ ID NO: 982) | 34.213 | 125A
ACG_PPIB_30bp_guide_U-flip_30_9 | PPIB | RESCUE | ACG | U/22 | R7C | GCAUGUUGCUUUCGGAGAGGCGCAGCAUCCA (SEQ ID NO: 983) | 31.652 | 125A
ACG_PPIB_30bp_guide_U-flip_30_11 | PPIB | RESCUE | ACG | U/20 | R7C | GUUCAUGUUGCUUUCGGAGAGGCGCAGCAUC (SEQ ID NO: 984) | 26.969 | 125A
ACG_PPIB_30bp_guide_U-flip_30_13 | PPIB | RESCUE | ACG | U/18 | R7C | GCCUUCAUGUUGCUUUCGGAGAGGCGCAGCA (SEQ ID NO: 985) | 22.539 | 125A
GCG_PPIB_30bp_guide_U-flip_30_7 | PPIB | RESCUE | GCG | U/24 | A19V | GGACCCCUCGAUGAGGGCGGCGGCAAGGAGC (SEQ ID NO: 986) | 1.030 | 125A
GCG_PPIB_30bp_guide_U-flip_30_9 | PPIB | RESCUE | GCG | U/22 | A19V | GCGGACCCCUCGAUGAGGGCGGCGGCAAGGA (SEQ ID NO: 987) | 4.819 | 125A
GCG_PPIB_30bp_guide_U-flip_30_11 | PPIB | RESCUE | GCG | U/20 | A19V | GGACGGACCCCUCGAUGAGGGCGGCGGCAAG (SEQ ID NO: 988) | 4.778 | 125A
GCG_PPIB_30bp_guide_U-flip_30_13 | PPIB | RESCUE | GCG | U/18 | A19V | GAAGACGGACCCCUCGAUGAGGGCGGCGGCA (SEQ ID NO: 989) | 1.115 | 125A
CCG_PPIB_30bp_guide_U-flip_30_7 | PPIB | RESCUE | CCG | U/24 | S21S | GGAAGACUGACCCCGCGAUGAGGGCGGCGGC (SEQ ID NO: 990) | 3.150 | 125A
CCG_PPIB_30bp_guide_U-flip_30_9 | PPIB | RESCUE | CCG | U/22 | S21S | GAAGAAGACUGACCCCGCGAUGAGGGCGGCG (SEQ ID NO: 991) | 2.659 | 125A
CCG_PPIB_30bp_guide_U-flip_30_11 | PPIB | RESCUE | CCG | U/20 | S21S | GGGAAGAAGACUGACCCCGCGAUGAGGGCGG (SEQ ID NO: 992) | 6.101 | 125A
CCG_PPIB_30bp_guide_U-flip_30_13 | PPIB | RESCUE | CCG | U/18 | S21S | GCAGGAAGAAGACUGACCCCGCGAUGAGGGC (SEQ ID NO: 993) | 4.372 | 125A
TCG_SMARCA4_30bp_guide_U-flip_30_7 | SMARCA4 | RESCUE | UCG | U/24 | S85L | GUCGUCCUACAUGCCCUUCUCAUGCAUGGAC (SEQ ID NO: 994) | 5.692 | 125A
TCG_SMARCA4_30bp_guide_U-flip_30_9 | SMARCA4 | RESCUE | UCG | U/22 | S85L | GGGUCGUCCUACAUGCCCUUCUCAUGCAUGG (SEQ ID NO: 995) | 6.807 | 125A
TCG_SMARCA4_30bp_guide_U-flip_30_11 | SMARCA4 | RESCUE | UCG | U/20 | S85L | GCGGGUCGUCCUACAUGCCCUUCUCAUGCAU (SEQ ID NO: 996) | 3.724 | 125A
TCG_SMARCA4_30bp_guide_U-flip_30_13 | SMARCA4 | RESCUE | UCG | U/18 | S85L | GCGCGGGUCGUCCUACAUGCCCUUCUCAUGC (SEQ ID NO: 997) | 2.274 | 125A
ACG_SMARCA4_30bp_guide_U-flip_30_7 | SMARCA4 | RESCUE | ACG | U/24 | D86D | GCGGGUCUUCCGACAUGCCCUUCUCAUGCAU (SEQ ID NO: 998) | 3.689 | 125A
ACG_SMARCA4_30bp_guide_U-flip_30_9 | SMARCA4 | RESCUE | ACG | U/22 | D86D | GCGCGGGUCUUCCGACAUGCCCUUCUCAUGC (SEQ ID NO: 999) | 4.868 | 125A
ACG_SMARCA4_30bp_guide_U-flip_30_11 | SMARCA4 | RESCUE | ACG | U/20 | D86D | GAGCGCGGGUCUUCCGACAUGCCCUUCUCAU (SEQ ID NO: 1000) | 6.943 | 125A
ACG_SMARCA4_30bp_guide_U-flip_30_13 | SMARCA4 | RESCUE | ACG | U/18 | D86D | GGUAGCGCGGGUCUUCCGACAUGCCCUUCUC (SEQ ID NO: 1001) | 5.785 | 125A
GCG_SMARCA4_30bp_guide_U-flip_30_7 | SMARCA4 | RESCUE | GCG | U/24 | R89C | GUGUAGCUCGGGUCGUCCGACAUGCCCUUCU (SEQ ID NO: 1002) | 0.642 | 125A
GCG_SMARCA4_30bp_guide_U-flip_30_9 | SMARCA4 | RESCUE | GCG | U/22 | R89C | GGUUGUAGCUCGGGUCGUCCGACAUGCCCUU (SEQ ID NO: 1003) | 1.808 | 125A
GCG_SMARCA4_30bp_guide_U-flip_30_11 | SMARCA4 | RESCUE | GCG | U/20 | R89C | GUGGUUGUAGCUCGGGUCGUCCGACAUGCCC (SEQ ID NO: 1004) | 2.277 | 125A
GCG_SMARCA4_30bp_guide_U-flip_30_13 | SMARCA4 | RESCUE | GCG | U/18 | R89C | GUCUGGUUGUAGCUCGGGUCGUCCGACAUGC (SEQ ID NO: 1005) | 1.323 | 125A
CCG_SMARCA4_30bp_guide_U-flip_30_7 | SMARCA4 | RESCUE | CCG | U/24 | P88L | GUAGCGCUGGUCGUCCGACAUGCCCUUCUCA (SEQ ID NO: 1006) | 4.412 | 125A
CCG_SMARCA4_30bp_guide_U-flip_30_9 | SMARCA4 | RESCUE | CCG | U/22 | P88L | GUGUAGCGCUGGUCGUCCGACAUGCCCUUCU (SEQ ID NO: 1007) | 2.911 | 125A
CCG_SMARCA4_30bp_guide_U-flip_30_11 | SMARCA4 | RESCUE | CCG | U/20 | P88L | GGUUGUAGCGCUGGUCGUCCGACAUGCCCUU (SEQ ID NO: 1008) | 4.819 | 125A
CCG_SMARCA4_30bp_guide_U-flip_30_13 | SMARCA4 | RESCUE | CCG | U/18 | P88L | GUGGUUGUAGCGCUGGUCGUCCGACAUGCCC (SEQ ID NO: 1009) | 0.841 | 125A
S33F_CTNNB1_30bp_C-flip_guide_30_7 | CTNNB1 | RESCUE | UCU | C/24 | S33F | GAUUCCACAGUCCAGGUAAGACUGUUGCUGC (SEQ ID NO: 1010) | 7.351 | 125B
S33F_CTNNB1_30bp_C-flip_guide_30_9 | CTNNB1 | RESCUE | UCU | C/22 | S33F | GGGAUUCCACAGUCCAGGUAAGACUGUUGCU (SEQ ID NO: 1011) | 8.783 | 125B
S33F_CTNNB1_30bp_C-flip_guide_30_11 | CTNNB1 | RESCUE | UCU | C/20 | S33F | GAUGGAUUCCACAGUCCAGGUAAGACUGUUG (SEQ ID NO: 1012) | 6.063 | 125B
S33F_CTNNB1_30bp_C-flip_guide_30_13 | CTNNB1 | RESCUE | UCU | C/18 | S33F | GGAAUGGAUUCCACAGUCCAGGUAAGACUGU (SEQ ID NO: 1013) | 7.893 | 125B
H36Y_CTNNB1_30bp_C-flip_guide_30_7 | CTNNB1 | RESCUE | CCA | C/24 | H36Y | GCAGAAUCGAUUCCAGAGUCCAGGUAAGACU (SEQ ID NO: 1014) | 0.406 | 125B
H36Y_CTNNB1_30bp_C-flip_guide_30_9 | CTNNB1 | RESCUE | CCA | C/22 | H36Y | GACCAGAAUCGAUUCCAGAGUCCAGGUAAGA (SEQ ID NO: 1015) | 1.178 | 125B
H36Y_CTNNB1_30bp_C-flip_guide_30_11 | CTNNB1 | RESCUE | CCA | C/20 | H36Y | GGCACCAGAAUCGAUUCCAGAGUCCAGGUAA (SEQ ID NO: 1016) | 0.369 | 125B
H36Y_CTNNB1_30bp_C-flip_guide_30_13 | CTNNB1 | RESCUE | CCA | C/18 | H36Y | GUGGCACCAGAAUCGAUUCCAGAGUCCAGGU (SEQ ID NO: 1017) | 0.589 | 125B
S37F_CTNNB1_30bp_C-flip_guide_30_7 | CTNNB1 | RESCUE | UCU | C/24 | S37F | GGCACCACAAUGGAUUCCAGAGUCCAGGUAA (SEQ ID NO: 1018) | 7.132 | 125B
S37F_CTNNB1_30bp_C-flip_guide_30_9 | CTNNB1 | RESCUE | UCU | C/22 | S37F | GUGGCACCACAAUGGAUUCCAGAGUCCAGGU (SEQ ID NO: 1019) | 8.465 | 125B
S37F_CTNNB1_30bp_C-flip_guide_30_11 | CTNNB1 | RESCUE | UCU | C/20 | S37F | GAGUGGCACCACAAUGGAUUCCAGAGUCCAG (SEQ ID NO: 1020) | 9.422 | 125B
S37F_CTNNB1_30bp_C-flip_guide_30_13 | CTNNB1 | RESCUE | UCU | C/18 | S37F | GGUAGUGGCACCACAAUGGAUUCCAGAGUCC (SEQ ID NO: 1021) | 9.734 | 125B
T41I_CTNNB1_30bp_C-flip_guide_30_7 | CTNNB1 | RESCUE | ACC | C/24 | T41I | GGCUGUGCUAGUGGCACCAGAAUGGAUUCCA (SEQ ID NO: 1022) | 5.252 | 125B
T41I_CTNNB1_30bp_C-flip_guide_30_9 | CTNNB1 | RESCUE | ACC | C/22 | T41I | GGAGCUGUGCUAGUGGCACCAGAAUGGAUUC (SEQ ID NO: 1023) | 12.765 | 125B
T41I_CTNNB1_30bp_C-flip_guide_30_11 | CTNNB1 | RESCUE | ACC | C/20 | T41I | GAGGAGCUGUGCUAGUGGCACCAGAAUGGAU (SEQ ID NO: 1024) | 14.196 | 125B
T41I_CTNNB1_30bp_C-flip_guide_30_13 | CTNNB1 | RESCUE | ACC | C/18 | T41I | GGAAGGAGCUGUGCUAGUGGCACCAGAAUGG (SEQ ID NO: 1025) | 11.364 | 125B
P44L_CTNNB1_30bp_C-flip_guide_30_7 | CTNNB1 | RESCUE | CCU | C/24 | P44L | GAGAGAACGAGCUGUGGUAGUGGCACCAGAA (SEQ ID NO: 1026) | 4.636 | 125B
P44L_CTNNB1_30bp_C-flip_guide_30_9 | CTNNB1 | RESCUE | CCU | C/22 | P44L | GUCAGAGAACGAGCUGUGGUAGUGGCACCAG (SEQ ID NO: 1027) | 8.413 | 125B
P44L_CTNNB1_30bp_C-flip_guide_30_11 | CTNNB1 | RESCUE | CCU | C/20 | P44L | GACUCAGAGAACGAGCUGUGGUAGUGGCACC (SEQ ID NO: 1028) | 8.276 | 125B
P44L_CTNNB1_30bp_C-flip_guide_30_13 | CTNNB1 | RESCUE | CCU | C/18 | P44L | GCCACUCAGAGAACGAGCUGUGGUAGUGGCA (SEQ ID NO: 1029) | 1.775 | 125B
P44S_CTNNB1_30bp_C-flip_guide_30_7 | CTNNB1 | RESCUE | UCU | C/24 | P44S | GGAGAAGCAGCUGUGGUAGUGGCACCAGAAU (SEQ ID NO: 1030) | 15.256 | 125B
P44S_CTNNB1_30bp_C-flip_guide_30_9 | CTNNB1 | RESCUE | UCU | C/22 | P44S | GCAGAGAAGCAGCUGUGGUAGUGGCACCAGA (SEQ ID NO: 1031) | 14.639 | 125B
P44S_CTNNB1_30bp_C-flip_guide_30_11 | CTNNB1 | RESCUE | UCU | C/20 | P44S | GCUCAGAGAAGCAGCUGUGGUAGUGGCACCA (SEQ ID NO: 1032) | 13.489 | 125B
P44S_CTNNB1_30bp_C-flip_guide_30_13 | CTNNB1 | RESCUE | UCU | C/18 | P44S | GCACUCAGAGAAGCAGCUGUGGUAGUGGCAC (SEQ ID NO: 1033) | 13.906 | 125B
S45F_CTNNB1_30bp_C-flip_guide_30_7 | CTNNB1 | RESCUE | UCU | C/24 | S45F | GCUCAGACAAGGAGCUGUGGUAGUGGCACCA (SEQ ID NO: 1034) | 6.550 | 125B
S45F_CTNNB1_30bp_C-flip_guide_30_9 | CTNNB1 | RESCUE | UCU | C/22 | S45F | GCACUCAGACAAGGAGCUGUGGUAGUGGCAC (SEQ ID NO: 1035) | 8.816 | 125B
S45F_CTNNB1_30bp_C-flip_guide_30_11 | CTNNB1 | RESCUE | UCU | C/20 | S45F | GACCACUCAGACAAGGAGCUGUGGUAGUGGC (SEQ ID NO: 1036) | 11.980 | 125B
S45F_CTNNB1_30bp_C-flip_guide_30_13 | CTNNB1 | RESCUE | UCU | C/18 | S45F | GUUACCACUCAGACAAGGAGCUGUGGUAGUG (SEQ ID NO: 1037) | 7.397 | 125B
TCG_KRAS_30bp_C-flip_guide_30_7 | KRAS | RESCUE | UCG | C/24 | L56L | GUGUGUCCAGAAUAUCCAAGAGACAGGUUUC (SEQ ID NO: 1038) | 13.024 | 125B
TCG_KRAS_30bp_C-flip_guide_30_9 | KRAS | RESCUE | UCG | C/22 | L56L | GGCUGUGUCCAGAAUAUCCAAGAGACAGGUU (SEQ ID NO: 1039) | 8.716 | 125B
TCG_KRAS_30bp_C-flip_guide_30_11 | KRAS | RESCUE | UCG | C/20 | L56L | GCUGCUGUGUCCAGAAUAUCCAAGAGACAGG (SEQ ID NO: 1040) | 10.067 | 125B
TCG_KRAS_30bp_C-flip_guide_30_13 | KRAS | RESCUE | UCG | C/18 | L56L | GACCUGCUGUGUCCAGAAUAUCCAAGAGACA (SEQ ID NO: 1041) | 6.656 | 125B
ACG_KRAS_30bp_C-flip_guide_30_7 | KRAS | RESCUE | ACG | C/24 | D30D | GAUAUUCCUCCACAAAAUGAUUCUGAAUUAG (SEQ ID NO: 1042) | 13.954 | 125B
ACG_KRAS_30bp_C-flip_guide_30_9 | KRAS | RESCUE | ACG | C/22 | D30D | GUCAUAUUCCUCCACAAAAUGAUUCUGAAUU (SEQ ID NO: 1043) | 12.962 | 125B
ACG_KRAS_30bp_C-flip_guide_30_11 | KRAS | RESCUE | ACG | C/20 | D30D | GGAUCAUAUUCCUCCACAAAAUGAUUCUGAA (SEQ ID NO: 1044) | 23.138 | 125B
ACG_KRAS_30bp_C-flip_guide_30_13 | KRAS | RESCUE | ACG | C/18 | D30D | GUGGAUCAUAUUCCUCCACAAAAUGAUUCUG (SEQ ID NO: 1045) | 16.629 | 125B
GCG_KRAS_30bp_C-flip_guide_30_7 | KRAS | RESCUE | GCG | C/24 | G13G | GGCCUACCCCACCAGCUCCAACUACCACAAG (SEQ ID NO: 1046) | 0.394 | 125B
GCG_KRAS_30bp_C-flip_guide_30_9 | KRAS | RESCUE | GCG | C/22 | G13G | GUUGCCUACCCCACCAGCUCCAACUACCACA (SEQ ID NO: 1047) | 0.886 | 125B
GCG_KRAS_30bp_C-flip_guide_30_11 | KRAS | RESCUE | GCG | C/20 | G13G | GUCUUGCCUACCCCACCAGCUCCAACUACCA (SEQ ID NO: 1048) | 0.950 | 125B
GCG_KRAS_30bp_C-flip_guide_30_13 | KRAS | RESCUE | GCG | C/18 | G13G | GACUCUUGCCUACCCCACCAGCUCCAACUAC (SEQ ID NO: 1049) | 0.311 | 125B
CCT_KRAS_30bp_C-flip_guide_30_7 | KRAS | RESCUE | CCU | C/24 | A18A | GCGUCAACGCACUCUUGCCUACGCCACCAGC (SEQ ID NO: 1050) | 1.407 | 125B
CCT_KRAS_30bp_C-flip_guide_30_9 | KRAS | RESCUE | CCU | C/22 | A18A | GAUCGUCAACGCACUCUUGCCUACGCCACCA (SEQ ID NO: 1051) | 5.431 | 125B
CCT_KRAS_30bp_C-flip_guide_30_11 | KRAS | RESCUE | CCU | C/20 | A18A | GGUAUCGUCAACGCACUCUUGCCUACGCCAC (SEQ ID NO: 1052) | 7.769 | 125B
CCT_KRAS_30bp_C-flip_guide_30_13 | KRAS | RESCUE | CCU | C/18 | A18A | GCUGUAUCGUCAACGCACUCUUGCCUACGCC (SEQ ID NO: 1053) | 1.685 | 125B
TCG_PPIB_30bp_C-flip_guide_30_7 | PPIB | RESCUE | UCG | C/24 | I18I | GCCCCGCCAUGAGGGCGGCGGCAAGGAGCAC (SEQ ID NO: 1054) | 4.128 | 125B
TCG_PPIB_30bp_C-flip_guide_30_9 | PPIB | RESCUE | UCG | C/22 | I18I | GGACCCCGCCAUGAGGGCGGCGGCAAGGAGC (SEQ ID NO: 1055) | 6.324 | 125B
TCG_PPIB_30bp_C-flip_guide_30_11 | PPIB | RESCUE | UCG | C/20 | I18I | GCGGACCCCGCCAUGAGGGCGGCGGCAAGGA (SEQ ID NO: 1056) | 6.386 | 125B
TCG_PPIB_30bp_C-flip_guide_30_13 | PPIB | RESCUE | UCG | C/18 | I18I | GGACGGACCCCGCCAUGAGGGCGGCGGCAAG (SEQ ID NO: 1057) | 5.348 | 125B
ACG_PPIB_30bp_C-flip_guide_30_7 | PPIB | RESCUE | ACG | C/24 | R7C | GUGUUGCCUUCGGAGAGGCGCAGCAUCCACA (SEQ ID NO: 1058) | 28.575 | 125B
ACG_PPIB_30bp_C-flip_guide_30_9 | PPIB | RESCUE | ACG | C/22 | R7C | GCAUGUUGCCUUCGGAGAGGCGCAGCAUCCA (SEQ ID NO: 1059) | 24.992 | 125B
ACG_PPIB_30bp_C-flip_guide_30_11 | PPIB | RESCUE | ACG | C/20 | R7C | GUUCAUGUUGCCUUCGGAGAGGCGCAGCAUC (SEQ ID NO: 1060) | 24.160 | 125B
ACG_PPIB_30bp_C-flip_guide_30_13 | PPIB | RESCUE | ACG | C/18 | R7C | GCCUUCAUGUUGCCUUCGGAGAGGCGCAGCA (SEQ ID NO: 1061) | 5.582 | 125B
GCG_PPIB_30bp_C-flip_guide_30_7 | PPIB | RESCUE | GCG | C/24 | A19V | GGACCCCCCGAUGAGGGCGGCGGCAAGGAGC (SEQ ID NO: 1062) | 0.584 | 125B
GCG_PPIB_30bp_C-flip_guide_30_9 | PPIB | RESCUE | GCG | C/22 | A19V | GCGGACCCCCCGAUGAGGGCGGCGGCAAGGA (SEQ ID NO: 1063) | 5.708 | 125B
GCG_PPIB_30bp_C-flip_guide_30_11 | PPIB | RESCUE | GCG | C/20 | A19V | GGACGGACCCCCCGAUGAGGGCGGCGGCAAG (SEQ ID NO: 1064) | 3.700 | 125B
GCG_PPIB_30bp_C-flip_guide_30_13 | PPIB | RESCUE | GCG | C/18 | A19V | GAAGACGGACCCCCCGAUGAGGGCGGCGGCA (SEQ ID NO: 1065) | 0.667 | 125B
CCG_PPIB_30bp_C-flip_guide_30_7 | PPIB | RESCUE | CCG | C/24 | S21S | GGAAGACCGACCCCGCGAUGAGGGCGGCGGC (SEQ ID NO: 1066) | 3.719 | 125B
CCG_PPIB_30bp_C-flip_guide_30_9 | PPIB | RESCUE | CCG | C/22 | S21S | GAAGAAGACCGACCCCGCGAUGAGGGCGGCG (SEQ ID NO: 1067) | 4.255 | 125B
CCG_PPIB_30bp_C-flip_guide_30_11 | PPIB | RESCUE | CCG | C/20 | S21S | GGGAAGAAGACCGACCCCGCGAUGAGGGCGG (SEQ ID NO: 1068) | 6.843 | 125B
CCG_PPIB_30bp_C-flip_guide_30_13 | PPIB | RESCUE | CCG | C/18 | S21S | GCAGGAAGAAGACCGACCCCGCGAUGAGGGC (SEQ ID NO: 1069) | 4.263 | 125B
TCG_SMARCA4_30bp_C-flip_guide_30_7 | SMARCA4 | RESCUE | UCG | C/24 | S85L | GUCGUCCCACAUGCCCUUCUCAUGCAUGGAC (SEQ ID NO: 1070) | 4.081 | 125B
TCG_SMARCA4_30bp_C-flip_guide_30_9 | SMARCA4 | RESCUE | UCG | C/22 | S85L | GGGUCGUCCCACAUGCCCUUCUCAUGCAUGG (SEQ ID NO: 1071) | 3.132 | 125B
TCG_SMARCA4_30bp_C-flip_guide_30_11 | SMARCA4 | RESCUE | UCG | C/20 | S85L | GCGGGUCGUCCCACAUGCCCUUCUCAUGCAU (SEQ ID NO: 1072) | 0.918 | 125B
TCG_SMARCA4_30bp_C-flip_guide_30_13 | SMARCA4 | RESCUE | UCG | C/18 | S85L | GCGCGGGUCGUCCCACAUGCCCUUCUCAUGC (SEQ ID NO: 1073) | 1.830 | 125B
ACG_SMARCA4_30bp_C-flip_guide_30_7 | SMARCA4 | RESCUE | ACG | C/24 | D86D | GCGGGUCCUCCGACAUGCCCUUCUCAUGCAU (SEQ ID NO: 1074) | 2.425 | 125B
ACG_SMARCA4_30bp_C-flip_guide_30_9 | SMARCA4 | RESCUE | ACG | C/22 | D86D | GCGCGGGUCCUCCGACAUGCCCUUCUCAUGC (SEQ ID NO: 1075) | 3.392 | 125B
ACG_SMARCA4_30bp_C-flip_guide_30_11 | SMARCA4 | RESCUE | ACG | C/20 | D86D | GAGCGCGGGUCCUCCGACAUGCCCUUCUCAU (SEQ ID NO: 1076) | 4.485 | 125B
ACG_SMARCA4_30bp_C-flip_guide_30_13 | SMARCA4 | RESCUE | ACG | C/18 | D86D | GGUAGCGCGGGUCCUCCGACAUGCCCUUCUC (SEQ ID NO: 1077) | 3.071 | 125B
GCG_SMARCA4_30bp_C-flip_guide_30_7 | SMARCA4 | RESCUE | GCG | C/24 | R89C | GUGUAGCCCGGGUCGUCCGACAUGCCCUUCU (SEQ ID NO: 1078) | 0.225 | 125B
GCG_SMARCA4_30bp_C-flip_guide_30_9 | SMARCA4 | RESCUE | GCG | C/22 | R89C | GGUUGUAGCCCGGGUCGUCCGACAUGCCCUU (SEQ ID NO: 1079) | 1.026 | 125B
GCG_SMARCA4_30bp_C-flip_guide_30_11 | SMARCA4 | RESCUE | GCG | C/20 | R89C | GUGGUUGUAGCCCGGGUCGUCCGACAUGCCC (SEQ ID NO: 1080) | 1.737 | 125B
GCG_SMARCA4_30bp_C-flip_guide_30_13 | SMARCA4 | RESCUE | GCG | C/18 | R89C | GUCUGGUUGUAGCCCGGGUCGUCCGACAUGC (SEQ ID NO: 1081) | 0.603 | 125B
CCG_SMARCA4_30bp_C-flip_guide_30_7 | SMARCA4 | RESCUE | CCG | C/24 | P88L | GUAGCGCCGGUCGUCCGACAUGCCCUUCUCA (SEQ ID NO: 1082) | 2.895 | 125B
CCG_SMARCA4_30bp_C-flip_guide_30_9 | SMARCA4 | RESCUE | CCG | C/22 | P88L | GUGUAGCGCCGGUCGUCCGACAUGCCCUUCU (SEQ ID NO: 1083) | 2.766 | 125B
CCG_SMARCA4_30bp_C-flip_guide_30_11 | SMARCA4 | RESCUE | CCG | C/20 | P88L | GGUUGUAGCGCCGGUCGUCCGACAUGCCCUU (SEQ ID NO: 1084) | 4.845 | 125B
CCG_SMARCA4_30bp_C-flip_guide_30_13 | SMARCA4 | RESCUE | CCG | C/18 | P88L | GUGGUUGUAGCGCCGGUCGUCCGACAUGCCC (SEQ ID NO: 1085) | 0.684 | 125B
NRAS_30bp_C-flip_guide_30_3 | NRAS | RESCUE | UCC | C/28 | I21I | GUGCAUUGUCAGUGCGCUUUUCCCAACACCA (SEQ ID NO: 1086) | 2.714 | 125C
NRAS_30bp_C-flip_guide_30_5 | NRAS | RESCUE | UCC | C/26 | I21I | GGCUGCAUUGUCAGUGCGCUUUUCCCAACAC (SEQ ID NO: 1087) | 8.839 | 125C
NRAS_30bp_C-flip_guide_30_7 | NRAS | RESCUE | UCC | C/24 | I21I | GUAGCUGCAUUGUCAGUGCGCUUUUCCCAAC (SEQ ID NO: 1088) | 10.690 | 125C
NRAS_30bp_C-flip_guide_30_9 | NRAS | RESCUE | UCC | C/22 | I21I | GAUUAGCUGCAUUGUCAGUGCGCUUUUCCCA (SEQ ID NO: 1089) | 13.278 | 125C
NRAS_30bp_C-flip_guide_30_11 | NRAS | RESCUE | UCC | C/20 | I21I | GGGAUUAGCUGCAUUGUCAGUGCGCUUUUCC (SEQ ID NO: 1090) | 18.858 | 125C
NKFB1_30bp_C-flip_guide_30_3 | NKFB1 | RESCUE | ACC | C/28 | P33S | GUGCUUGAAAUACUUCUGGAUUAAAUAUUGU (SEQ ID NO: 1091) | 16.138 | 125C
NKFB1_30bp_C-flip_guide_30_5 | NKFB1 | RESCUE | ACC | C/26 | P33S | GUGUGCUUGAAAUACUUCUGGAUUAAAUAUU (SEQ ID NO: 1092) | 9.580 | 125C
NKFB1_30bp_C-flip_guide_30_7 | NKFB1 | RESCUE | ACC | C/24 | P33S | GUCUGUGCUUGAAAUACUUCUGGAUUAAAUA (SEQ ID NO: 1093) | 14.701 | 125C
NKFB1_30bp_C-flip_guide_30_9 | NKFB1 | RESCUE | ACC | C/22 | P33S | GCAUCUGUGCUUGAAAUACUUCUGGAUUAAA (SEQ ID NO: 1094) | 13.808 | 125C
NKFB1_30bp_C-flip_guide_30_11 | NKFB1 | RESCUE | ACC | C/20 | P33S | GGCCAUCUGUGCUUGAAAUACUUCUGGAUUA (SEQ ID NO: 1095) | 21.529 | 125C
EZH2_30bp_C-flip_guide_30_3 | EZH2 | RESCUE | UCA | C/28 | F32F | GCUCAACCUCUUGAGCUGUCUCAGUCGCAUG (SEQ ID NO: 1096) | 2.696 | 125C
EZH2_30bp_C-flip_guide_30_5 | EZH2 | RESCUE | UCA | C/26 | F32F | GGUCUCAACCUCUUGAGCUGUCUCAGUCGCA (SEQ ID NO: 1097) | 0.106 | 125C
EZH2_30bp_C-flip_guide_30_7 | EZH2 | RESCUE | UCA | C/24 | F32F | GUCGUCUCAACCUCUUGAGCUGUCUCAGUCG (SEQ ID NO: 1098) | 11.539 | 125C
EZH2_30bp_C-flip_guide_30_9 | EZH2 | RESCUE | UCA | C/22 | F32F | GGCUCGUCUCAACCUCUUGAGCUGUCUCAGU (SEQ ID NO: 1099) | 10.710 | 125C
EZH2_30bp_C-flip_guide_30_11 | EZH2 | RESCUE | UCA | C/20 | F32F | GCAGCUCGUCUCAACCUCUUGAGCUGUCUCA (SEQ ID NO: 1100) | 15.855 | 125C
NF2_30bp_C-flip_guide_30_3 | NF2 | RESCUE | ACG | C/28 | T21M | GACCUCUUGGGUUGCUUCCUCUUGAGAGAGC (SEQ ID NO: 1101) | 8.318 | 125C
NF2_30bp_C-flip_guide_30_5 | NF2 | RESCUE | ACG | C/26 | T21M | GGAACCUCUUGGGUUGCUUCCUCUUGAGAGA (SEQ ID NO: 1102) | 18.846 | 125C
NF2_30bp_C-flip_guide_30_7 | NF2 | RESCUE | ACG | C/24 | T21M | GGUGAACCUCUUGGGUUGCUUCCUCUUGAGA (SEQ ID NO: 1103) | 18.617 | 125C
NF2_30bp_C-flip_guide_30_9 | NF2 | RESCUE | ACG | C/22 | T21M | GCGGUGAACCUCUUGGGUUGCUUCCUCUUGA (SEQ ID NO: 1104) | 24.904 | 125C
NF2_30bp_C-flip_guide_30_11 | NF2 | RESCUE | ACG | C/20 | T21M | GCACGGUGAACCUCUUGGGUUGCUUCCUCUU (SEQ ID NO: 1105) | 14.735 | 125C
RAF1_30bp_C-flip_guide_30_3 | RAF1 | RESCUE | UCC | C/28 | P30S | GAGCAGAGAUGCAGCUGGAGCCAUCAAACAC (SEQ ID NO: 1106) | 8.257 | 125C
RAF1_30bp_C-flip_guide_30_5 | RAF1 | RESCUE | UCC | C/26 | P30S | GGUAGCAGAGAUGCAGCUGGAGCCAUCAAAC (SEQ ID NO: 1107) | 14.435 | 125C
RAF1_30bp_C-flip_guide_30_7 | RAF1 | RESCUE | UCC | C/24 | P30S | GUUGUAGCAGAGAUGCAGCUGGAGCCAUCAA (SEQ ID NO: 1108) | 20.867 | 125C
RAF1_30bp_C-flip_guide_30_9 | RAF1 | RESCUE | UCC | C/22 | P30S | GUAUUGUAGCAGAGAUGCAGCUGGAGCCAUC (SEQ ID NO: 1109) | 13.821 | 125C
RAF1_30bp_C-flip_guide_30_11 | RAF1 | RESCUE | UCC | C/20 | P30S | GACUAUUGUAGCAGAGAUGCAGCUGGAGCCA (SEQ ID NO: 1110) | 16.277 | 125C
NRAS_30bp_U-flip_guide_30_3 | NRAS | RESCUE | UCC | U/28 | I21I | GUGUAUUGUCAGUGCGCUUUUCCCAACACCA (SEQ ID NO: 1111) | 3.500 | 125C
NRAS_30bp_U-flip_guide_30_5 | NRAS | RESCUE | UCC | U/26 | I21I | GGCUGUAUUGUCAGUGCGCUUUUCCCAACAC (SEQ ID NO: 1112) | 8.805 | 125C
NRAS_30bp_U-flip_guide_30_7 | NRAS | RESCUE | UCC | U/24 | I21I | GUAGCUGUAUUGUCAGUGCGCUUUUCCCAAC (SEQ ID NO: 1113) | 12.296 | 125C
NRAS_30bp_U-flip_guide_30_9 | NRAS | RESCUE | UCC | U/22 | I21I | GAUUAGCUGUAUUGUCAGUGCGCUUUUCCCA (SEQ ID NO: 1114) | 12.521 | 125C
NRAS_30bp_U-flip_guide_30_11 | NRAS | RESCUE | UCC | U/20 | I21I | GGGAUUAGCUGUAUUGUCAGUGCGCUUUUCC (SEQ ID NO: 1115) | 18.399 | 125C
NKFB1_30bp_U-flip_guide_30_3 | NKFB1 | RESCUE | ACC | U/28 | P33S | GUGUUUGAAAUACUUCUGGAUUAAAUAUUGU (SEQ ID NO: 1116) | 14.277 | 125C
NKFB1_30bp_U-flip_guide_30_5 | NKFB1 | RESCUE | ACC | U/26 | P33S | GUGUGUUUGAAAUACUUCUGGAUUAAAUAUU (SEQ ID NO: 1117) | 10.928 | 125C
NKFB1_30bp_U-flip_guide_30_7 | NKFB1 | RESCUE | ACC | U/24 | P33S | GUCUGUGUUUGAAAUACUUCUGGAUUAAAUA (SEQ ID NO: 1118) | 18.012 | 125C
NKFB1_30bp_U-flip_guide_30_9 | NKFB1 | RESCUE | ACC | U/22 | P33S | GCAUCUGUGUUUGAAAUACUUCUGGAUUAAA (SEQ ID NO: 1119) | 21.468 | 125C
NKFB1_30bp_U-flip_guide_30_11 | NKFB1 | RESCUE | ACC | U/20 | P33S | GGCCAUCUGUGUUUGAAAUACUUCUGGAUUA (SEQ ID NO: 1120) | 24.376 | 125C
EZH2_30bp_U-flip_guide_30_3 | EZH2 | RESCUE | UCA | U/28 | F32F | GCUUAACCUCUUGAGCUGUCUCAGUCGCAUG (SEQ ID NO: 1121) | 9.307 | 125C
EZH2_30bp_U-flip_guide_30_5 | EZH2 | RESCUE | UCA | U/26 | F32F | GGUCUUAACCUCUUGAGCUGUCUCAGUCGCA (SEQ ID NO: 1122) | 9.393 | 125C
EZH2_30bp_U-flip_guide_30_7 | EZH2 | RESCUE | UCA | U/24 | F32F | GUCGUCUUAACCUCUUGAGCUGUCUCAGUCG (SEQ ID NO: 1123) | 8.525 | 125C
EZH2_30bp_U-flip_guide_30_9 | EZH2 | RESCUE | UCA | U/22 | F32F | GGCUCGUCUUAACCUCUUGAGCUGUCUCAGU (SEQ ID NO: 1124) | 6.976 | 125C
EZH2_30bp_U-flip_guide_30_11 | EZH2 | RESCUE | UCA | U/20 | F32F | GCAGCUCGUCUUAACCUCUUGAGCUGUCUCA (SEQ ID NO: 1125) | 9.534 | 125C
NF2_30bp_U-flip_guide_30_3 | NF2 | RESCUE | ACG | U/28 | T21M | GACUUCUUGGGUUGCUUCCUCUUGAGAGAGC (SEQ ID NO: 1126) | 7.253 | 125C
NF2_30bp_U-flip_guide_30_5 | NF2 | RESCUE | ACG | U/26 | T21M | GGAACUUCUUGGGUUGCUUCCUCUUGAGAGA (SEQ ID NO: 1127) | 16.618 | 125C
NF2_30bp_U-flip_guide_30_7 | NF2 | RESCUE | ACG | U/24 | T21M | GGUGAACUUCUUGGGUUGCUUCCUCUUGAGA (SEQ ID NO: 1128) | 15.696 | 125C
NF2_30bp_U-flip_guide_30_9 | NF2 | RESCUE | ACG | U/22 | T21M | GCGGUGAACUUCUUGGGUUGCUUCCUCUUGA (SEQ ID NO: 1129) | 18.984 | 125C
NF2_30bp_U-flip_guide_30_11 | NF2 | RESCUE | ACG | U/20 | T21M | GCACGGUGAACUUCUUGGGUUGCUUCCUCUU (SEQ ID NO: 1130) | 16.393 | 125C
RAF1_30bp_U-flip_guide_30_3 | RAF1 | RESCUE | UCC | U/28 | P30S | GAGUAGAGAUGCAGCUGGAGCCAUCAAACAC (SEQ ID NO: 1131) | 9.450 | 125C
RAF1_30bp_U-flip_guide_30_5 | RAF1 | RESCUE | UCC | U/26 | P30S | GGUAGUAGAGAUGCAGCUGGAGCCAUCAAAC (SEQ ID NO: 1132) | 15.056 | 125C
RAF1_30bp_U-flip_guide_30_7 | RAF1 | RESCUE | UCC | U/24 | P30S | GUUGUAGUAGAGAUGCAGCUGGAGCCAUCAA (SEQ ID NO: 1133) | 20.868 | 125C
RAF1_30bp_U-flip_guide_30_9 | RAF1 | RESCUE | UCC | U/22 | P30S | GUAUUGUAGUAGAGAUGCAGCUGGAGCCAUC (SEQ ID NO: 1134) | 15.518 | 125C
RAF1_30bp_U-flip_guide_30_11 | RAF1 | RESCUE | UCC | U/20 | P30S | GACUAUUGUAGUAGAGAUGCAGCUGGAGCCA (SEQ ID NO: 1135) | 20.003 | 125C
STAT3 S727 C-flip | STAT3 | RESCUE | UCC | C/22 | S727F | GUGCGGGGGCACAUCGGCAGGUCAAUGGUAU (SEQ ID NO: 1136) | 6.127 | 129B
STAT3 S727 U-flip | STAT3 | RESCUE | UCC | U/22 | S727F | GUGCGGGGGUACAUCGGCAGGUCAAUGGUAU (SEQ ID NO: 1137) | 3.132 | 129B
STAT3 S727 G-flip | STAT3 | RESCUE | UCC | G/22 | S727F | GUGCGGGGGGACAUCGGCAGGUCAAUGGUAU (SEQ ID NO: 1138) | 0.154 | 129B
STAT3 S727 A-flip | STAT3 | RESCUE | UCC | A/22 | S727F | GUGCGGGGGAACAUCGGCAGGUCAAUGGUAU (SEQ ID NO: 1139) | 0.156 | 129B
STAT1 S727 C-flip | STAT1 | RESCUE | UCC | C/22 | S727F | GCCUCAGGACACAUGGGGAGCAGGUUGUCUG (SEQ ID NO: 1140) | 1.141 | 129D
STAT1 S727 U-flip | STAT1 | RESCUE | UCC | U/22 | S727F | GCCUCAGGAUACAUGGGGAGCAGGUUGUCUG (SEQ ID NO: 1141) | 0.801 | 129D
STAT1 S727 G-flip | STAT1 | RESCUE | UCC | G/22 | S727F | GCCUCAGGAGACAUGGGGAGCAGGUUGUCUG (SEQ ID NO: 1142) | 0.223 | 129D
STAT1 S727 A-flip | STAT1 | RESCUE | UCC | A/22 | S727F | GCCUCAGGAAACAUGGGGAGCAGGUUGUCUG (SEQ ID NO: 1143) | 0.193 | 129D
STAT1 Y701 C-flip | STAT1 | REPAIR | UAU | C/34 | Y701C | GCAACUCAGUCUUGAUACAUCCAGUUCCUUUAGGGCCAUCAAGUUCCAUUG (SEQ ID NO: 1144) | 1.289 | 129E
STAT1 Y701 U-flip | STAT1 | REPAIR | UAU | U/34 | Y701C | GCAACUCAGUCUUGAUAUAUCCAGUUCCUUUAGGGCCAUCAAGUUCCAUUG (SEQ ID NO: 1145) | 0.050 | 129E
STAT1 Y701 G-flip | STAT1 | REPAIR | UAU | G/34 | Y701C | GCAACUCAGUCUUGAUAGAUCCAGUUCCUUUAGGGCCAUCAAGUUCCAUUG (SEQ ID NO: 1146) | 0.011 | 129E
STAT3 Y705 C-flip | STAT3 | REPAIR | UAC | C/34 | Y705C | GAAACUUGGUCUUCAGGCAUGGGGCAGCGCUACCUGGGUCAGCUUCAGGAU (SEQ ID NO: 1147) | 5.080 | 139B
STAT3_S727_30_5 | STAT3 | RESCUE | UCC | C/26 | S727F | GGGGGACAUCGGCAGGUCAAUGGUAUUGCU (SEQ ID NO: 1148) | 4.842 | 139C
STAT3_S727_30_7 | STAT3 | RESCUE | UCC | C/24 | S727F | CGGGGGGACAUCGGCAGGUCAAUGGUAUUG (SEQ ID NO: 1149) | 2.990 | 139C
STAT3_S727_30_11 | STAT3 | RESCUE | UCC | C/20 | S727F | AGUGCGGGGGGACAUCGGCAGGUCAAUGGU (SEQ ID NO: 1150) | 6.786 | 139C
STAT1_S727_30_5 | STAT1 | RESCUE | UCC | C/26 | S727F | AGGAGACAUGGGGAGCAGGUUGUCUGUGGU (SEQ ID NO: 1151) | 2.128 | 139C
STAT1_S727_30_7 | STAT1 | RESCUE | UCC | C/24 | S727F | UCAGGAGACAUGGGGAGCAGGUUGUCUGUG (SEQ ID NO: 1152) | 3.215 | 139C
STAT1_S727_30_11 | STAT1 | RESCUE | UCC | C/20 | S727F | CUCCUCAGGAGACAUGGGGAGCAGGUUGUCUG (SEQ ID NO: 1153) | 2.146 | 139C
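Table 30 lists C-flip and U-flip guides for the same site that differ at a single base. As an illustrative aid only, the Python sketch below locates that base for one such pair, SEQ ID NO: 1011 and SEQ ID NO: 935, with each spacer written as one string by joining the two wrapped halves given in the table. The function name `flip_position` is hypothetical, and reading the number after the slash in the "Base flip/position" column as a distance from the 3' end is an inference from the listed sequences, not a statement made by the table.

```python
def flip_position(spacer_a: str, spacer_b: str) -> int:
    """Return the 1-based index, counted from the 5' end, of the single
    base at which two equal-length spacers differ."""
    if len(spacer_a) != len(spacer_b):
        raise ValueError("spacers must have the same length")
    diffs = [i for i, (a, b) in enumerate(zip(spacer_a, spacer_b), start=1) if a != b]
    if len(diffs) != 1:
        raise ValueError(f"expected exactly one differing base, found {len(diffs)}")
    return diffs[0]


# Spacers copied from Table 30 (two table halves joined).
C_FLIP = "GGGAUUCCACAGUCCAGGUAAGACUGUUGCU"  # SEQ ID NO: 1011, S33F_CTNNB1_30bp_C-flip_guide_30_9
U_FLIP = "GGGAUUCCAUAGUCCAGGUAAGACUGUUGCU"  # SEQ ID NO: 935, S33F_CTNNB1_30bp_guide_U-flip_30_9

pos_5p = flip_position(C_FLIP, U_FLIP)  # counted from the 5' end, leading G included
pos_3p = len(C_FLIP) - pos_5p + 1       # the same base counted from the 3' end
print(pos_5p, pos_3p)  # prints: 10 22
```

For this pair the differing base sits 22 nucleotides from the 3' end, which agrees with the C/22 and U/22 entries for these two guides. Counted from the 5' end it is base 10 of the 31-nt string, or base 9 if the leading G is not counted, which agrees with the "_30_9" suffix in the guide names.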

TABLE 31 Guide sequences used for synthetic target editing

| Name | Targeted gene | REPAIR/RESCUE | Motif | Base flip/position | Codon change | Spacer sequence | Editing percentage (first figure) | First figure |
|---|---|---|---|---|---|---|---|---|
| NM_000016.5_C-flip_guide | ACADM | RESCUE | ACA | C/7 | H67Y | GUAUCAUCUUCUGCAGCCACUGGGAUGAUUU (SEQ ID NO: 1154) | 7.500 | 127A |
| NM_000018.3_C-flip_guide | ACADVL | RESCUE | GCG | C/9 | A283V | GUCUCCACCCCAAAAGCUGUGAUCUUCUCCU (SEQ ID NO: 1155) | 6.274 | 127A |
| NM_000071.2_C-flip_guide | CBS | RESCUE | GCG | C/9 | R109C | GGAACUCACCCUUGGCCAAGAGCUCACACUU (SEQ ID NO: 1156) | 6.140 | 127A |
| NM_000138.4_C-flip_guide | FBN1 | RESCUE | GCG | C/5 | R1408C | GGAGCCCUCAUCAAGGUCUGUACAAGUGAAG (SEQ ID NO: 1157) | 13.751 | 127A |
| NM_000141.4_C-flip_guide | FGFR2 | RESCUE | CCC | C/7 | P267S | GCUGUGGCGGCAUUUGCCGGCAGUCCGGCUU (SEQ ID NO: 1158) | 16.131 | 127A |
| NM_000152.4_C-flip_guide | GAA | RESCUE | CCC | C/7 | P552L | GGCCUGGCGGGUCCCCCCAACCACCCCAGGC (SEQ ID NO: 1159) | 2.644 | 127A |
| NM_000341.3_C-flip_guide | SLC3A1 | RESCUE | ACG | C/7 | T467M | GAGAAGCCUGUUCAUCACGUUGACAUACUGA (SEQ ID NO: 1160) | 28.198 | 127A |
| NM_000375.2_C-flip_guide | UROS | RESCUE | ACG | C/9 | R73C | GCUCCAAACCUAACUCUGCUGCUUCCACUGC (SEQ ID NO: 1161) | 13.632 | 127A |
| NM_000431.3_C-flip_guide | MVK | RESCUE | ACA | C/9 | T268I | GUGGCAUCUCUUGAGGUCAGGAGGGGGGCCA (SEQ ID NO: 1162) | 13.775 | 127A |
| NM_000551.3_C-flip_guide | VHL | RESCUE | CCG | C/7 | P158L | GUCUUUCCGAGUAUACACUGGCAGUGUGAUA (SEQ ID NO: 1163) | 22.466 | 127A |
| NM_001256850.1_C-flip_guide | TTN | RESCUE | ACG | C/9 | R30071C | GCUUUCCACCUGGGCCAGGGGAAUCAAGCAC (SEQ ID NO: 1164) | 23.207 | 127A |
| NM_002397.4_C-flip_guide | MEF2C | RESCUE | ACG | C/9 | T1M | GUUCUCCCCCUAGUCCCCGUUUUUCUUCUCU (SEQ ID NO: 1165) | 2.313 | 127A |
| NM_002474.2_C-flip_guide | MYH11 | RESCUE | CCG | C/9 | P1264L | GUGGACUGCCGCUCCUGCACCUGCGCCUCCA (SEQ ID NO: 1166) | 34.770 | 127A |
| NM_002834.4_C-flip_guide | PTPN11 | RESCUE | CCU | C/9 | L285F | GAUGAUCAACGGGCAGGAUGUUUUUAUAUCU (SEQ ID NO: 1167) | 0.711 | 127A |
| NM_004004.5_C-flip_guide | GJB2 | RESCUE | ACG | C/5 | R77W | GGCCCCUAGCCGGAUGUGGGAGAUGGGGAAG (SEQ ID NO: 1168) | 5.347 | 1127A |
| NM_004572.3_C-flip_guide | PKP2 | RESCUE | CCG | C/9 | R796C | GUGUGUAACCGGCAGAGGCUGUAGUUUCAAU (SEQ ID NO: 1169) | 20.024 | 127A |
| NM_005609.3_C-flip_guide | PYGM | RESCUE | GCG | C/9 | R798W | GCCGCGUCCCCUCUCUUGGGUUCUUGUACAA (SEQ ID NO: 1170) | 5.946 | 127A |
| NM_005633.3_C-flip_guide | SOS1 | RESCUE | ACG | C/9 | T269M | GCAUCUGUCCUUUCUACUGUAUCUUCUAUAU (SEQ ID NO: 1171) | 39.582 | 127A |
| NM_014139.2_C-flip_guide | SCN11A | RESCUE | CCG | C/9 | P396L | GCAACAGCCCGGGUUAAGUUAAUCAGGUAGA (SEQ ID NO: 1172) | 32.565 | 127A |
| NM_014874.3_C-flip_guide | MFN2 | RESCUE | CCG | C/9 | P76L | GCAACAGCCCGGGUUAAGUUAAUCAGGUAGA (SEQ ID NO: 1173) | 20.769 | 127A |
| NM_015559.2_C-flip_guide | SETBP1 | RESCUE | ACU | C/7 | T871I | GGUCCCACUGCCGCUGUCGCUGGGGAUCGUC (SEQ ID NO: 1174) | 20.136 | 127A |
| NM_020630.4_C-flip_guide | RET | RESCUE | CCG | C/5 | R620C | GUCGCCGAAGCACUUCUCCUCCUCAGGGAAG (SEQ ID NO: 1175) | 15.762 | 127A |
| NM_000016.5_F_30bp_C-flip_guide_30_5 | ACADM | RESCUE | ACA | C/5 | H67Y | GUCAUCUUCUGCAGCCACUGGGAUGAUUUCC (SEQ ID NO: 1176) | 2.900 | 127B |
| NM_000016.5_F_30bp_C-flip_guide_30_7 | ACADM | RESCUE | ACA | C/7 | H67Y | GUAUCAUCUUCUGCAGCCACUGGGAUGAUUU (SEQ ID NO: 1177) | 7.500 | 127B |
| NM_000016.5_F_30bp_C-flip_guide_30_9 | ACADM | RESCUE | ACA | C/9 | H67Y | GUUUAUCAUCUUCUGCAGCCACUGGGAUGAU (SEQ ID NO: 1178) | 6.477 | 127B |
| NM_000018.3_F_30bp_C-flip_guide_30_5 | ACADVL | RESCUE | GCG | C/5 | A283V | GCACCCCAAAAGCUGUGAUCUUCUCCUUCAC (SEQ ID NO: 1178) | 4.474 | 127B |
| NM_000018.3_F_30bp_C-flip_guide_30_7 | ACADVL | RESCUE | GCG | C/7 | A283V | GUCCACCCCAAAAGCUGUGAUCUUCUCCUUC (SEQ ID NO: 1179) | 5.357 | 127B |
| NM_000018.3_F_30bp_C-flip_guide_30_9 | ACADVL | RESCUE | GCG | C/9 | A283V | GUCUCCACCCCAAAAGCUGUGAUCUUCUCCU (SEQ ID NO: 1180) | 6.274 | 127B |
| NM_000071.2_F_30bp_C-flip_guide_30_5 | CBS | RESCUE | GCG | C/5 | R109C | GUCACCCUUGGCCAAGAGCUCACACUUCAGG (SEQ ID NO: 1181) | 1.269 | 127B |
| NM_000071.2_F_30bp_C-flip_guide_30_7 | CBS | RESCUE | GCG | C/7 | R109C | GACUCACCCUUGGCCAAGAGCUCACACUUCA (SEQ ID NO: 1182) | 2.346 | 127B |
| NM_000071.2_F_30bp_C-flip_guide_30_9 | CBS | RESCUE | GCG | C/9 | R109C | GGAACUCACCCUUGGCCAAGAGCUCACACUU (SEQ ID NO: 1183) | 6.140 | 127B |
| NM_000138.4_F_30bp_C-flip_guide_30_5 | FBN1 | RESCUE | GCG | C/5 | R1408C | GGAGCCCUCAUCAAGGUCUGUACAAGUGAAG (SEQ ID NO: 1184) | 13.751 | 127B |
| NM_000138.4_F_30bp_C-flip_guide_30_7 | FBN1 | RESCUE | GCG | C/7 | R1408C | GCAGAGCCCUCAUCAAGGUCUGUACAAGUGA (SEQ ID NO: 1185) | 12.998 | 127B |
| NM_000138.4_F_30bp_C-flip_guide_30_9 | FBN1 | RESCUE | GCG | C/9 | R1408C | GCUCAGAGCCCUCAUCAAGGUCUGUACAAGU (SEQ ID NO: 1186) | 10.845 | 127B |
| NM_000141.4_F_30bp_C-flip_guide_30_5 | FGFR2 | RESCUE | CCC | C/5 | P267S | GGUGGCGGCAUUUGCCGGCAGUCCGGCUUGG (SEQ ID NO: 1187) | 5.882 | 127B |
| NM_000141.4_F_30bp_C-flip_guide_30_7 | FGFR2 | RESCUE | CCC | C/7 | P267S | GCUGUGGCGGCAUUUGCCGGCAGUCCGGCUU (SEQ ID NO: 1188) | 16.131 | 127B |
| NM_000141.4_F_30bp_C-flip_guide_30_9 | FGFR2 | RESCUE | CCC | C/9 | P267S | GCACUGUGGCGGCAUUUGCCGGCAGUCCGGC (SEQ ID NO: 1189) | 13.994 | 127B |
| NM_000152.4_F_30bp_C-flip_guide_30_5 | GAA | RESCUE | CCC | C/5 | P552L | GCUGGCGGGUCCCCCCAACCACCCCAGGCAC (SEQ ID NO: 1190) | 2.316 | 127B |
| NM_000152.4_F_30bp_C-flip_guide_30_7 | GAA | RESCUE | CCC | C/7 | P552L | GGCCUGGCGGGUCCCCCCAACCACCCCAGGC (SEQ ID NO: 1191) | 2.644 | 127B |
| NM_000152.4_F_30bp_C-flip_guide_30_9 | GAA | RESCUE | CCC | C/9 | P552L | GCCGCCUGGCGGGUCCCCCCAACCACCCCAG (SEQ ID NO: 1192) | 0.168 | 127B |
| NM_000341.3_F_30bp_C-flip_guide_30_5 | SLC3A1 | RESCUE | ACG | C/5 | T467M | GAAGCCUGUUCAUCACGUUGACAUACUGAUU (SEQ ID NO: 1193) | 26.164 | 127B |
| NM_000341.3_F_30bp_C-flip_guide_30_7 | SLC3A1 | RESCUE | ACG | C/7 | T467M | GAGAAGCCUGUUCAUCACGUUGACAUACUGA (SEQ ID NO: 1194) | 28.198 | 127B |
| NM_000341.3_F_30bp_C-flip_guide_30_9 | SLC3A1 | RESCUE | ACG | C/9 | T467M | GAAAGAAGCCUGUUCAUCACGUUGACAUACU (SEQ ID NO: 1195) | 24.276 | 127B |
| NM_000375.2_F_30bp_C-flip_guide_30_5 | UROS | RESCUE | ACG | C/5 | R73C | GAAACCUAACUCUGCUGCUUCCACUGCUCUG (SEQ ID NO: 1196) | 8.539 | 127B |
| NM_000375.2_F_30bp_C-flip_guide_30_7 | UROS | RESCUE | ACG | C/7 | R73C | GCCAAACCUAACUCUGCUGCUUCCACUGCUC (SEQ ID NO: 1197) | 7.367 | 127B |
| NM_000375.2_F_30bp_C-flip_guide_30_9 | UROS | RESCUE | ACG | C/9 | R73C | GCUCCAAACCUAACUCUGCUGCUUCCACUGC (SEQ ID NO: 1198) | 13.632 | 127B |
| NM_000431.3_F_30bp_C-flip_guide_30_5 | MVK | RESCUE | ACA | C/5 | T268I | GAUCUCUUGAGGUCAGGAGGGGGGCCACGAU (SEQ ID NO: 1199) | 2.858 | 127B |
| NM_000431.3_F_30bp_C-flip_guide_30_7 | MVK | RESCUE | ACA | C/7 | T268I | GGCAUCUCUUGAGGUCAGGAGGGGGGCCACG (SEQ ID NO: 1200) | 13.008 | 127B |
| NM_000431.3_F_30bp_C-flip_guide_30_9 | MVK | RESCUE | ACA | C/9 | T268I | GUGGCAUCUCUUGAGGUCAGGAGGGGGGCCA (SEQ ID NO: 1201) | 13.775 | 127B |
| NM_000551.3_F_30bp_C-flip_guide_30_5 | VHL | RESCUE | CCG | C/5 | P158L | GUUUCCGAGUAUACACUGGCAGUGUGAUAUU (SEQ ID NO: 1202) | 21.328 | 127B |
| NM_000551.3_F_30bp_C-flip_guide_30_7 | VHL | RESCUE | CCG | C/7 | P158L | GUCUUUCCGAGUAUACACUGGCAGUGUGAUA (SEQ ID NO: 1203) | 22.466 | 127B |
| NM_000551.3_F_30bp_C-flip_guide_30_9 | VHL | RESCUE | CCG | C/9 | P158L | GGCUCUUUCCGAGUAUACACUGGCAGUGUGA (SEQ ID NO: 1204) | 20.534 | 127B |
| NM_001256850.1_F_30bp_C-flip_guide_30_5 | TTN | RESCUE | ACG | C/5 | R30071C | GCCACCUGGGCCAGGGGAAUCAAGCACUUUG (SEQ ID NO: 1205) | 6.388 | 127B |
| NM_001256850.1_F_30bp_C-flip_guide_30_7 | TTN | RESCUE | ACG | C/7 | R30071C | GUUCCACCUGGGCCAGGGGAAUCAAGCACUU (SEQ ID NO: 1206) | 13.115 | 127B |
| NM_001256850.1_F_30bp_C-flip_guide_30_9 | TTN | RESCUE | ACG | C/9 | R30071C | GCUUUCCACCUGGGCCAGGGGAAUCAAGCAC (SEQ ID NO: 1207) | 23.207 | 127B |
| NM_002397.4_F_30bp_C-flip_guide_30_5 | MEF2C | RESCUE | ACG | C/5 | T1M | GCCCCCUAGUCCCCGUUUUUCUUCUCUCUCU (SEQ ID NO: 1208) | 0.769 | 127B |
| NM_002397.4_F_30bp_C-flip_guide_30_7 | MEF2C | RESCUE | ACG | C/7 | T1M | GCUCCCCCUAGUCCCCGUUUUUCUUCUCUCU (SEQ ID NO: 1209) | 1.270 | 127B |
| NM_002397.4_F_30bp_C-flip_guide_30_9 | MEF2C | RESCUE | ACG | C/9 | T1M | GUUCUCCCCCUAGUCCCCGUUUUUCUUCUCU (SEQ ID NO: 1210) | 2.313 | 127B |
| NM_002474.2_F_30bp_C-flip_guide_30_5 | MYH11 | RESCUE | CCG | C/5 | P1264L | GCUGCCGCUCCUGCACCUGCGCCUCCAGCUU (SEQ ID NO: 1211) | 18.278 | 127B |
| NM_002474.2_F_30bp_C-flip_guide_30_7 | MYH11 | RESCUE | CCG | C/7 | P1264L | GGACUGCCGCUCCUGCACCUGCGCCUCCAGC (SEQ ID NO: 1212) | 25.663 | 127B |
| NM_002474.2_F_30bp_C-flip_guide_30_9 | MYH11 | RESCUE | CCG | C/9 | P1264L | GUGGACUGCCGCUCCUGCACCUGCGCCUCCA (SEQ ID NO: 1213) | 34.770 | 127B |
| NM_002834.4_F_30bp_C-flip_guide_30_5 | PTPN11 | RESCUE | CCU | C/5 | L285F | GUCAACGGGCAGGAUGUUUUUAUAUCUAUUU (SEQ ID NO: 1214) | 0.520 | 127B |
| NM_002834.4_F_30bp_C-flip_guide_30_7 | PTPN11 | RESCUE | CCU | C/7 | L285F | GGAUCAACGGGCAGGAUGUUUUUAUAUCUAU (SEQ ID NO: 1215) | 0.562 | 127B |
| NM_002834.4_F_30bp_C-flip_guide_30_9 | PTPN11 | RESCUE | CCU | C/9 | L285F | GAUGAUCAACGGGCAGGAUGUUUUUAUAUCU (SEQ ID NO: 1216) | 0.711 | 127B |
| NM_004004.5_F_30bp_C-flip_guide_30_5 | GJB2 | RESCUE | ACG | C/5 | R77W | GGCCCCUAGCCGGAUGUGGGAGAUGGGGAAG (SEQ ID NO: 1217) | 5.347 | 127B |
| NM_004004.5_F_30bp_C-flip_guide_30_7 | GJB2 | RESCUE | ACG | C/7 | R77W | GGGGCCCCUAGCCGGAUGUGGGAGAUGGGGA (SEQ ID NO: 1218) | 4.763 | 127B |
| NM_004004.5_F_30bp_C-flip_guide_30_9 | GJB2 | RESCUE | ACG | C/9 | R77W | GCAGGGCCCCUAGCCGGAUGUGGGAGAUGGG (SEQ ID NO: 1219) | 5.188 | 127B |
| NM_004572.3_F_30bp_C-flip_guide_30_5 | PKP2 | RESCUE | CCG | C/5 | R796C | GUAACCGGCAGAGGCUGUAGUUUCAAUGAGA (SEQ ID NO: 1220) | 12.583 | 127B |
| NM_004572.3_F_30bp_C-flip_guide_30_7 | PKP2 | RESCUE | CCG | C/7 | R796C | GUGUAACCGGCAGAGGCUGUAGUUUCAAUGA (SEQ ID NO: 1221) | 13.950 | 127B |
| NM_004572.3_F_30bp_C-flip_guide_30_9 | PKP2 | RESCUE | CCG | C/9 | R796C | GUGUGUAACCGGCAGAGGCUGUAGUUUCAAU (SEQ ID NO: 1222) | 20.024 | 127B |
| NM_005609.3_F_30bp_C-flip_guide_30_5 | PYGM | RESCUE | GCG | C/5 | R798W | GGUCCCCUCUCUUGGGUUCUUGUACAAGGCG (SEQ ID NO: 1223) | 4.495 | 127B |
| NM_005609.3_F_30bp_C-flip_guide_30_7 | PYGM | RESCUE | GCG | C/7 | R798W | GGCGUCCCCUCUCUUGGGUUCUUGUACAAGG (SEQ ID NO: 1224) | 2.128 | 127B |
| NM_005609.3_F_30bp_C-flip_guide_30_9 | PYGM | RESCUE | GCG | C/9 | R798W | GCCGCGUCCCCUCUCUUGGGUUCUUGUACAA (SEQ ID NO: 1225) | 5.946 | 127B |
| NM_005633.3_F_30bp_C-flip_guide_30_5 | SOS1 | RESCUE | ACG | C/5 | T269M | GUGUCCUUUCUACUGUAUCUUCUAUAUGGCC (SEQ ID NO: 1226) | 12.149 | 127B |
| NM_005633.3_F_30bp_C-flip_guide_30_7 | SOS1 | RESCUE | ACG | C/7 | T269M | GUCUGUCCUUUCUACUGUAUCUUCUAUAUGG (SEQ ID NO: 1227) | 36.068 | 127B |
| NM_005633.3_F_30bp_C-flip_guide_30_9 | SOS1 | RESCUE | ACG | C/9 | T269M | GCAUCUGUCCUUUCUACUGUAUCUUCUAUAU (SEQ ID NO: 1228) | 39.582 | 127B |
| NM_014139.2_F_30bp_C-flip_guide_30_5 | SCN11A | RESCUE | CCG | C/5 | P396L | GAGCCCGGGUUAAGUUAAUCAGGUAGAAGGA (SEQ ID NO: 1229) | 22.989 | 127B |
| NM_014139.2_F_30bp_C-flip_guide_30_7 | SCN11A | RESCUE | CCG | C/7 | P396L | GACAGCCCGGGUUAAGUUAAUCAGGUAGAAG (SEQ ID NO: 1230) | 16.783 | 127B |
| NM_014139.2_F_30bp_C-flip_guide_30_9 | SCN11A | RESCUE | CCG | C/9 | P396L | GCAACAGCCCGGGUUAAGUUAAUCAGGUAGA (SEQ ID NO: 1231) | 32.565 | 127B |
| NM_014874.3_F_30bp_C-flip_guide_30_5 | MFN2 | RESCUE | CCG | C/5 | P76L | GGUCCCGAACCUGUUCUUCUGUGGUAACGGG (SEQ ID NO: 1232) | 7.822 | 127B |
| NM_014874.3_F_30bp_C-flip_guide_30_7 | MFN2 | RESCUE | CCG | C/7 | P76L | GACGUCCCGAACCUGUUCUUCUGUGGUAACG (SEQ ID NO: 1233) | 9.585 | 127B |
| NM_014874.3_F_30bp_C-flip_guide_30_9 | MFN2 | RESCUE | CCG | C/9 | P76L | GUGACGUCCCGAACCUGUUCUUCUGUGGUAA (SEQ ID NO: 1234) | 20.769 | 127B |
| NM_015559.2_F_30bp_C-flip_guide_30_5 | SETBP1 | RESCUE | ACU | C/5 | T871I | GCCCACUGCCGCUGUCGCUGGGGAUCGUCUC (SEQ ID NO: 1235) | 4.665 | 127B |
| NM_015559.2_F_30bp_C-flip_guide_30_7 | SETBP1 | RESCUE | ACU | C/7 | T871I | GGUCCCACUGCCGCUGUCGCUGGGGAUCGUC (SEQ ID NO: 1236) | 20.136 | 127B |
| NM_015559.2_F_30bp_C-flip_guide_30_9 | SETBP1 | RESCUE | ACU | C/9 | T871I | GCUGUCCCACUGCCGCUGUCGCUGGGGAUCG (SEQ ID NO: 1237) | 9.130 | 127B |
| NM_020630.4_F_30bp_C-flip_guide_30_5 | RET | RESCUE | CCG | C/5 | R620C | GUCGCCGAAGCACUUCUCCUCCUCAGGGAAG (SEQ ID NO: 1238) | 15.762 | 127B |
| NM_020630.4_F_30bp_C-flip_guide_30_7 | RET | RESCUE | CCG | C/7 | R620C | GGCUCGCCGAAGCACUUCUCCUCCUCAGGGA (SEQ ID NO: 1239) | 12.758 | 127B |
| NM_020630.4_F_30bp_C-flip_guide_30_9 | RET | RESCUE | CCG | C/9 | R620C | GGGGCUCGCCGAAGCACUUCUCCUCCUCAGG (SEQ ID NO: 1240) | 15.652 | 127B |
| ApoE4 rs429358 C flip 30 | APOE | RESCUE | GCG | C/30 | C130R | gccacguccuccauguccgcgcccagccggg (SEQ ID NO: 1241) | 0.040 | 128 |
| ApoE4 rs429358 C flip 28 | APOE | RESCUE | GCG | C/28 | C130R | ggcccacguccuccauguccgcgcccagccg (SEQ ID NO: 1242) | 0.199 | 128 |
| ApoE4 rs429358 C flip 26 | APOE | RESCUE | GCG | C/26 | C130R | gccgcccacguccuccauguccgcgcccagc (SEQ ID NO: 1243) | 0.534 | 128 |
| ApoE4 rs429358 C flip 24 | APOE | RESCUE | GCG | C/24 | C130R | gggccgcccacguccuccauguccgcgccca (SEQ ID NO: 1244) | 0.601 | 128 |
| ApoE4 rs429358 C flip 22 | APOE | RESCUE | GCG | C/22 | C130R | ggcggccgcccacguccuccauguccgcgcc (SEQ ID NO: 1245) | 0.305 | 128 |
| ApoE4 rs429358 C flip 20 | APOE | RESCUE | GCG | C/20 | C130R | gaggcggccgcccacguccuccauguccgcg (SEQ ID NO: 1246) | 0.251 | 128 |
| ApoE4 rs429358 C flip 18 | APOE | RESCUE | GCG | C/18 | C130R | gccaggcggccgcccacguccuccauguccg (SEQ ID NO: 1247) | 0.098 | 128 |
| ApoE4 rs429358 C flip 16 | APOE | RESCUE | GCG | C/16 | C130R | gcaccaggcggccgcccacguccuccauguc (SEQ ID NO: 1248) | 0.066 | 128 |
| ApoE4 rs429358 U flip 30 | APOE | RESCUE | GCG | U/30 | C130R | gucacguccuccauguccgcgcccagccggg (SEQ ID NO: 1249) | 0.037 | 128 |
| ApoE4 rs429358 U flip 28 | APOE | RESCUE | GCG | U/28 | C130R | ggcucacguccuccauguccgcgcccagccg (SEQ ID NO: 1250) | 0.831 | 128 |
| ApoE4 rs429358 U flip 26 | APOE | RESCUE | GCG | U/26 | C130R | gccgcucacguccuccauguccgcgcccagc (SEQ ID NO: 1251) | 0.787 | 128 |
| ApoE4 rs429358 U flip 24 | APOE | RESCUE | GCG | U/24 | C130R | gggccgcucacguccuccauguccgcgccca (SEQ ID NO: 1252) | 5.382 | 128 |
| ApoE4 rs429358 U flip 22 | APOE | RESCUE | GCG | U/22 | C130R | ggcggccgcucacguccuccauguccgcgcc (SEQ ID NO: 1253) | 0.913 | 128 |
| ApoE4 rs429358 U flip 20 | APOE | RESCUE | GCG | U/20 | C130R | gaggcggccgcucacguccuccauguccgcg (SEQ ID NO: 1254) | 0.654 | 128 |
| ApoE4 rs429358 U flip 18 | APOE | RESCUE | GCG | U/18 | C130R | gccaggcggccgcucacguccuccauguccg (SEQ ID NO: 1255) | 0.281 | 128 |
| ApoE4 rs429358 U flip 16 | APOE | RESCUE | GCG | U/16 | C130R | gcaccaggcggccgcucacguccuccauguc (SEQ ID NO: 1256) | 0.236 | 128 |
| ApoE4 rs7412 C flip 30 | APOE | RESCUE | GCG | C/30 | C176R | gccuucugcaggucaucggcaucgcggagga (SEQ ID NO: 1257) | 0.079 | 128 |
| ApoE4 rs7412 C flip 28 | APOE | RESCUE | GCG | C/28 | C176R | ggcccuucugcaggucaucggcaucgcggag (SEQ ID NO: 1258) | 2.900 | 128 |
| ApoE4 rs7412 C flip 26 | APOE | RESCUE | GCG | C/26 | C176R | gaggcccuucugcaggucaucggcaucgcgg (SEQ ID NO: 1259) | 0.908 | 128 |
| ApoE4 rs7412 C flip 24 | APOE | RESCUE | GCG | C/24 | C176R | gccaggcccuucugcaggucaucggcaucgc (SEQ ID NO: 1260) | 2.397 | 128 |
| ApoE4 rs7412 C flip 22 | APOE | RESCUE | GCG | C/22 | C176R | gugccaggcccuucugcaggucaucggcauc (SEQ ID NO: 1261) | 3.436 | 128 |
| ApoE4 rs7412 C flip 20 | APOE | RESCUE | GCG | C/20 | C176R | gacugccaggcccuucugcaggucaucggca (SEQ ID NO: 1262) | 5.725 | 128 |
| ApoE4 rs7412 C flip 18 | APOE | RESCUE | GCG | C/18 | C176R | gacacugccaggcccuucugcaggucaucgg (SEQ ID NO: 1263) | 2.987 | 128 |
| ApoE4 rs7412 C flip 16 | APOE | RESCUE | GCG | C/16 | C176R | gguacacugccaggcccuucugcaggucauc (SEQ ID NO: 1264) | 0.407 | 128 |
| ApoE4 rs7412 U flip 30 | APOE | RESCUE | GCG | U/30 | C176R | gucuucugcaggucaucggcaucgcggagga (SEQ ID NO: 1265) | 0.125 | 128 |
| ApoE4 rs7412 U flip 28 | APOE | RESCUE | GCG | U/28 | C176R | ggcucuucugcaggucaucggcaucgcggag (SEQ ID NO: 1266) | 3.633 | 128 |
| ApoE4 rs7412 U flip 26 | APOE | RESCUE | GCG | U/26 | C176R | gaggcucuucugcaggucaucggcaucgcgg (SEQ ID NO: 1267) | 1.087 | 128 |
| ApoE4 rs7412 U flip 24 | APOE | RESCUE | GCG | U/24 | C176R | gccaggcucuucugcaggucaucggcaucgc (SEQ ID NO: 1268) | 3.305 | 128 |
| ApoE4 rs7412 U flip 22 | APOE | RESCUE | GCG | U/22 | C176R | gugccaggcucuucugcaggucaucggcauc (SEQ ID NO: 1269) | 6.810 | 128 |
| ApoE4 rs7412 U flip 20 | APOE | RESCUE | GCG | U/20 | C176R | gacugccaggcucuucugcaggucaucggca (SEQ ID NO: 1270) | 10.902 | 128 |
| ApoE4 rs7412 U flip 18 | APOE | RESCUE | GCG | U/18 | C176R | gacacugccaggcucuucugcaggucaucgg (SEQ ID NO: 1271) | 9.357 | 128 |
| ApoE4 rs7412 U flip 16 | APOE | RESCUE | GCG | U/16 | C176R | gguacacugccaggcucuucugcaggucauc (SEQ ID NO: 1272) | 0.643 | 128 |

TABLE 32 Mammalian plasmids and maps

| Plasmid | Description | Benchling link |
|---|---|---|
| pC0043 | PspCas13b crRNA backbone | benchling.com/s/seq-OH6nMnZCZn930BWqcFNa |
| pC0076 | CMV-dRanCas13b-mapkNES-GS-dADAR2 E488Q | benchling.com/s/seq-BulRvsrtwP4aEJtTqYM2 |
| pC0077 | pCMV-dRanCas13b-mapkNES-GS-dADAR2(E488Q/V351G/S486A/T375S/S370C/P462A/N597I/L332I/I398V) r8 | benchling.com/s/seq-gQ13PMPLkcO6OceAfmpC |
| pC0078 | pCMV-dRanCas13b-mapkNES-GS-dADAR2(E488Q/V351G/S486A/T375S/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418E/S661T) r16 | benchling.com/s/seq-19Ytwwh0o0vSlbyXYZ95 |
| pC0079 | pCMV-dRanCas13b-mapkNES-GS-dADAR2(E488Q/V351G/S486A/T375A/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418E/S661T) RESCUE-S | benchling.com/s/seq-WX6VnavLS6JaaZ54XAOx |
| pC0080 | pCMV-dCas13b12-HIVNES-GS-dADAR2(E488Q/V351G/S486A/T375S/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418E/S661T) RESCUE | benchling.com/s/seq-GQqPCRE9I6KnEfHksQem |
| pC0081 | pCMV-dCas13b12-HIVNES-GS-dADAR2(E488Q/V351G/S486A/T375S/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418E/S661T/S375A) RESCUE-S | benchling.com/s/seq-qjbEAXZgupeRXBa8abls |
| pC0082 | CMV-Cluciferase-polyA EF1a-G-luciferase(C82R)-polyA C to U reporter TCG motif | benchling.com/s/seq-Qjsg3Yx0r1Hs77GT58BI |
| pC0083 | CMV-Cluciferase-polyA EF1a-G-luciferase(C82R)-polyA C to U reporter GCG motif | benchling.com/s/seq-Z8zwu3LdetcuYHAFGnpe |
| pC0084 | CMV-Cluciferase-polyA EF1a-G-luciferase(C82R)-polyA C to U reporter ACG motif | benchling.com/s/seq-G2Iag6I8NBQAXqbJnou5 |
| pC0085 | CMV-Cluciferase-polyA EF1a-G-luciferase(C82R)-polyA C to U reporter CCG motif | benchling.com/s/seq-alkwhNUsFTg80TVmpquP |
| pC0086 | CMV-Cluciferase-polyA EF1a-G-luciferase(L77P)-polyA C to U reporter CCA motif | benchling.com/s/seq-1J8Fm6vtF7GZS676Q7p |
| pC0087 | CMV-Cluciferase-polyA EF1a-G-luciferase(L77P)-polyA C to U reporter CCT motif | benchling.com/s/seq-5MMokwvxoAjq6ML2sjjZ |
| pC0088 | pCMV-ADAR2dd(E488Q/V351G/S486A/T375S/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418E/S661T) r16 | benchling.com/s/seq-YISAybq2YnuclVwYDy95 |
| pC0089 | pCMV-ADAR2 full length (E488Q/V351G/S486A/T375S/S370C/P462A/N597I/L332I/I398V/K350I/M383L/D619G/S582T/V440I/S495N/K418E/S661T) r16 | benchling.com/s/seq-95ZpoHj9GhQFzIu3m6cb |
| pC0090 | Beta catenin reporter M50 Super 8x (TCF/LEF binding sites) TOPFlash with Gluc/Cluc | benchling.com/s/seq-jPxZnxs3wSeKZhgTTDBu |
| pC0091 | Beta catenin reporter control M51 Super 8x (mutated TCF/LEF binding sites) FOPFlash with Gluc/Cluc | benchling.com/s/seq-130b6c9baCfw8R3lTgsR |

TABLE 33 Yeast plasmids and maps

| Description | Benchling link |
|---|---|
| pGAL-dRanCas13b-GS-dADAR2 [RESCUEr0 Yeast] | benchling.com/s/seq-w1l2aOHR2gSe4P2aQ7VY |
| pGAL-dRanCas13b-GS-dADAR2(V351/S486A/T375S) [RESCUEr3 Yeast] | benchling.com/s/seq-saQngvNf6i3GhSGF0H3I |
| pGAL-dRanCas13b-GS-dADAR2(V351G/S486A/T375S/S370C/P462A/L332I) [RESCUEr7 Yeast] | benchling.com/s/seq-GIJ7BnpV3Vd3XtKiIxdm |
| pYES3/CT pADH1-HH-Targeting-RanCas13b_DR--HDV-space-ADH1_terminator His (P196L) [Yeast target His P196L] | benchling.com/s/seq-Xs2ffVMn4FwwQ79zDDEo |
| pYES3/CT pADH1-HH-Golden-gate-BsmBi-BsmbI_RanCas13b_DR--HDV-space-ADH1_terminator His (P196L) [Yeast target His P196L NT] | benchling.com/s/seq-UM9NjG7JKK0GFe9MowGo |
| pYES3/CT pADH1-HH-Guide-RanCas13b_DR--HDV-space-ADH1_terminator His S129P [Yeast target His S129P] | benchling.com/s/seq-EefJI5brqll3fm0B5Qc5 |
| pYES3/CT pADH1-HH-Golden-gate-BsmBi-BsmbI_RanCas13b_DR--HDV-space-ADH1_terminator His Motifs S129P [Yeast target His S129P NT] | benchling.com/s/seq-bt7gOlrp8OuOoV3YJWZG |
| pYES3/CT pADH1-HH-Y66H-targeting-RanCas13b-DR-HDV-ADH1-term ATG-yeGFP Y66H [Yeast target GFP Y66H] | benchling.com/s/seq-hiMELqTYPT9y0nOAKEq2 |
| pYES3/CT pADH1-HH-Golden-gate-BsmBi-BsmbI_-RanCas13b_DR-HDV-ADH1-term ATG-yeGFP Y66H Reporter [Yeast target GFP Y66H NT] | benchling.com/s/seq-OCWlvnjeKYwSbG8GELTQ |
| pYES3/CT pADH1-HH-Targeting-RanCas13b_DR--HDV-space-ADH1_terminator His (S22P) [Yeast target 30/26 His S22P] | benchling.com/s/seq-ziOgQXpXGZwot9NDkFJf |
| pYES3/CT pADH1-HH-Targeting-RanCas13b_DR--HDV-space-ADH1_terminator His (S22P) [Yeast target 30/24 His S22P] | benchling.com/s/seq-Ni9S7NsmGwWEYQM7K1EF |
| pYES3/CT pADH1-HH-Targeting-RanCas13b_DR--HDV-space-ADH1_terminator His (S22P) [Yeast target 30/22 His S22P] | benchling.com/s/seq-yW539UdpUtm9kZbLafaJ |
| pYES3/CT pADH1-HH-Targeting-RanCas13b_DR--HDV-space-ADH1_terminator His (S22P) [Yeast target 30/20 His S22P] | benchling.com/s/seq-z37Sri5Pds8UofSHRtGe |
| pYES3/CT pADH1-HH-Golden-gate-BsmBi-BsmbI_RanCas13b_DR--HDV-space-ADH1_terminator His (S22P) [Yeast target His S22P NT] | benchling.com/s/seq-6HoWi69XrcLL4nW2ya0V |

TABLE 34 Guide sequences used for yeast targeting

| Name | Targeted gene | Motif | Base flip/spacer length/position | Codon change | Spacer sequence | First figure |
|---|---|---|---|---|---|---|
| His Y66H targeting | EGFP | UCA | U/50/34 | Y66H | aaacauugaacaccauuaguuaaaguagugacuaagguuggccauggaac (SEQ ID NO: 1273) | 113A |
| His L196P targeting | HIS | CCU | U/50/34 | L196P | ucuuauggcaaccgcaugagccuugaacgcacucucacuacggugaugau (SEQ ID NO: 1274) | 113C |
| His S129P targeting | HIS | UCC | C/30/26 | S129P | gcuugcaagugccucauccaaaggcgcaaau (SEQ ID NO: 1275) | 113D |
| His S22P targeting 30/26 | HIS | UCC | U/30/26 | S22P | aauguaaucgcaaucugaaucuugguuuca (SEQ ID NO: 1276) | 113E |
| His S22P targeting 30/24 | HIS | UCC | U/30/24 | S22P | uuaauguaaucgcaaucugaaucuugguuu (SEQ ID NO: 1277) | 113E |
| His S22P targeting 30/22 | HIS | UCC | U/30/22 | S22P | cuuuaauguaaucgcaaucugaaucuuggu (SEQ ID NO: 1278) | 113E |
| His S22P targeting 30/20 | HIS | UCC | U/30/20 | S22P | cccuuuaauguaaucgcaaucugaaucuug (SEQ ID NO: 1279) | 113E |
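The "base flip/spacer length/position" field of Table 34 can be cross-checked against the spacer sequences with a short script. The following is only a minimal sketch and not part of the disclosed methods. It assumes that the third value of the field is a 1-based position counted from the 3' end of the spacer; this convention is inferred from the His S22P rows and is not stated in the table. The names `FlipField`, `parse_flip_field`, and `check_spacer` are hypothetical.

```python
from typing import NamedTuple, Tuple


class FlipField(NamedTuple):
    base: str      # base expected at the flip position, e.g. "U"
    length: int    # declared spacer length in nucleotides
    position: int  # flip position (ASSUMED 1-based, counted from the 3' end)


def parse_flip_field(field: str) -> FlipField:
    """Split a field such as 'U/30/26' into base, length, and position."""
    base, length, position = field.split("/")
    return FlipField(base.upper(), int(length), int(position))


def check_spacer(spacer: str, field: str) -> Tuple[bool, bool]:
    """Return (length_matches, base_matches) for one table row."""
    flip = parse_flip_field(field)
    spacer = spacer.upper()
    length_ok = len(spacer) == flip.length
    # Assumption: position p from the 3' end corresponds to the
    # 0-based index len(spacer) - p when reading from the 5' end.
    index = len(spacer) - flip.position
    base_ok = 0 <= index < len(spacer) and spacer[index] == flip.base
    return length_ok, base_ok


# Spacers copied from the His S22P rows of Table 34 (SEQ ID NOs: 1276-1279).
ROWS = {
    "U/30/26": "aauguaaucgcaaucugaaucuugguuuca",
    "U/30/24": "uuaauguaaucgcaaucugaaucuugguuu",
    "U/30/22": "cuuuaauguaaucgcaaucugaaucuuggu",
    "U/30/20": "cccuuuaauguaaucgcaaucugaaucuug",
}

if __name__ == "__main__":
    for field, spacer in ROWS.items():
        print(field, check_spacer(spacer, field))
```

Under the stated assumption, each of the four His S22P rows passes both checks. A different indexing convention would only require changing the `index` expression.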


Various modifications and variations of the described methods, pharmaceutical compositions, and kits of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific embodiments, it will be understood that it is capable of further modifications and that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in the art are intended to be within the scope of the invention. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known customary practice within the art to which the invention pertains and as may be applied to the essential features hereinbefore set forth.

What is claimed is:
 1. An engineered adenosine deaminase comprising one or more mutations, wherein the engineered adenosine deaminase has cytidine deaminase activity, wherein said adenosine deaminase protein or catalytic domain thereof comprises one or more of the mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, S661T based on amino acid sequence positions of hADAR2-D, and corresponding mutations in a homologous ADAR protein.
 2. An engineered adenosine deaminase comprising one or more mutations, wherein the engineered adenosine deaminase has cytidine deaminase activity.
 3. The engineered adenosine deaminase of claim 2, wherein the engineered adenosine deaminase has adenosine deaminase activity.
 4. The engineered adenosine deaminase of claim 2, wherein the engineered adenosine deaminase is a portion of a fusion protein.
 5. The engineered adenosine deaminase of claim 2, wherein the fusion protein comprises a functional domain.
 6. The engineered adenosine deaminase of claim 2, wherein the functional domain is capable of directing the engineered adenosine deaminase to bind to a target nucleic acid.
 7. The engineered adenosine deaminase of claim 2, wherein the functional domain is a CRISPR-Cas protein of any one of claims 50 to 55.
 8. The engineered adenosine deaminase of claim 2, wherein the CRISPR-Cas protein is a dead form CRISPR-Cas protein or CRISPR-Cas nickase protein.
 9. The engineered adenosine deaminase of claim 2, wherein the one or more mutations comprises: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, S661T based on amino acid sequence positions of hADAR2-D, and corresponding mutations in a homologous ADAR protein.
 10. The engineered adenosine deaminase of claim 2, wherein the one or more mutations comprises: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, and S661T based on amino acid sequence positions of hADAR2-D, and corresponding mutations in a homologous ADAR protein.
 11. A polynucleotide encoding the engineered adenosine deaminase of any one of the above claims, or a catalytic domain thereof.
 12. A vector comprising the polynucleotide of claim 11.
 13. A pharmaceutical composition comprising the engineered adenosine deaminase of any one of claims 1-10 or a catalytic domain thereof formulated for delivery by liposomes, nanoparticles, exosomes, microvesicles, nucleic acid nanoassemblies, a gene gun, or an implantable device.
 14. An engineered cell expressing the engineered adenosine deaminase of any one of claims 1-10 or a catalytic domain thereof.
 15. The engineered cell of claim 14, wherein the cell transiently expresses the engineered adenosine deaminase or the catalytic domain thereof.
 16. The engineered cell of claim 15, wherein the cell non-transiently expresses the engineered adenosine deaminase or the catalytic domain thereof.
 17. An engineered, non-naturally occurring system for modifying nucleotides in a target nucleic acid, comprising a) a dead CRISPR-Cas or CRISPR-Cas nickase protein, or a nucleotide sequence encoding said dead Cas or Cas nickase protein; b) a guide molecule comprising a guide sequence that hybridizes to a target sequence and designed to form a complex with the dead CRISPR-Cas or CRISPR-Cas nickase protein; and c) a nucleotide deaminase protein or catalytic domain thereof, or a nucleotide sequence encoding said nucleotide deaminase protein or catalytic domain thereof, wherein said nucleotide deaminase protein or catalytic domain thereof is covalently or non-covalently linked to said dead CRISPR-Cas or CRISPR-Cas nickase protein or said guide molecule is adapted to link thereof after delivery.
 18. The system of claim 17, wherein said adenosine deaminase protein or catalytic domain thereof comprises one or more of the mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, S661T based on amino acid sequence positions of hADAR2-D, and corresponding mutations in a homologous ADAR protein.
 19. The system of claim 17, wherein said adenosine deaminase protein or catalytic domain thereof comprises mutations: E488Q, V351G, S486A, T375S, S370C, P462A, N597I, L332I, I398V, K350I, M383L, D619G, S582T, V440I, S495N, K418E, and S661T based on amino acid sequence positions of hADAR2-D, and corresponding mutations in a homologous ADAR protein.
 20. The system of claim 17, wherein the CRISPR-Cas protein is Cas9, Cas12, Cas13, Cas14, CasX, or CasY.
 21. The system of claim 17, wherein the CRISPR-Cas protein is Cas13b.
 22. The system of claim 17, wherein the CRISPR-Cas protein is Cas13b-t1, Cas13b-t2, or Cas13b-t3.
 23. The system of claim 17, wherein the CRISPR-Cas is an engineered CRISPR-Cas protein of any one of claims 50 to 367.
 24. A method for modifying nucleotide in a target nucleic acid, comprising: delivering to said target nucleic acid the engineered adenosine deaminase of any one of claims 1-10, or the system of any one of claims 17-23, wherein the deaminase deaminates a nucleotide at one or more target loci on the target nucleic acid.
 25. The method of claim 24, wherein said nucleotide deaminase protein or catalytic domain thereof has been modified to increase activity against a DNA-RNA heteroduplex.
 26. The method of claim 24, wherein said nucleotide deaminase protein or catalytic domain thereof has been modified to reduce off-target effects.
 27. The method of claim 24, wherein the target nucleic acid is within a cell.
 28. The method of claim 24, wherein said cell is a eukaryotic cell.
 29. The method of claim 24, wherein said cell is a non-human animal cell.
 30. The method of claim 24, wherein said cell is a human cell.
 31. The method of claim 24, wherein said cell is a plant cell.
 32. The method of claim 24, wherein said target nucleic acid is within an animal.
 33. The method of claim 24, wherein said target nucleic acid is within a plant.
 34. The method of claim 24, wherein said target nucleic acid is comprised in a DNA molecule in vitro.
 35. The method of claim 24, wherein the engineered adenosine deaminase, or one or more components of the system are delivered to the cell as a ribonucleoprotein complex.
 36. The method of claim 24, wherein the engineered adenosine deaminase, or one or more components of the system are delivered via one or more particles, one or more vesicles, or one or more viral vectors.
 37. The method of claim 24, wherein said one or more particles comprise a lipid, a sugar, a metal or a protein.
 38. The method of claim 24, wherein said one or more particles comprise lipid nanoparticles.
 39. The method of claim 24, wherein said one or more vesicles comprise exosomes or liposomes.
 40. The method of claim 24, wherein said one or more viral vectors comprise one or more adenoviral vectors, one or more lentiviral vectors, or one or more adeno-associated viral vectors.
 41. The method of claim 24, where said method modifies a cell, a cell line or an organism by manipulation of one or more target sequences at genomic loci of interest.
 42. The method of claim 24, wherein said deamination of said nucleotide at said target locus of interest remedies a disease caused by a G→A or C→T point mutation or a pathogenic SNP.
 43. The method of claim 24, wherein said disease is selected from cancer, haemophilia, beta-thalassemia, Marfan syndrome and Wiskott-Aldrich syndrome.
 44. The method of claim 24, wherein said deamination of said nucleotide at said target locus of interest remedies a disease caused by a T→C or A→G point mutation or a pathogenic SNP.
 45. The method of claim 24, wherein said deamination of said nucleotide at said target locus of interest inactivates a target gene at said target locus.
 46. The method of claim 24, wherein the engineered adenosine deaminase, or one or more components of the system are delivered by liposomes, nanoparticles, exosomes, microvesicles, nucleic acid nanoassemblies, a gene gun, an implantable device, or the vector system of claim 302.
 47. The method of claim 24, wherein modification of the nucleotide modifies gene product encoded at the target locus or expression of the gene product.
 48. The engineered adenosine deaminase of any one of claims 1-10 or the system of any one of claims 17-23, wherein the adenosine protein or catalytic domain thereof comprises a mutation on S375 based on amino acid sequence positions of hADAR2-D, and a corresponding mutation in a homologous ADAR protein.
 49. The engineered adenosine deaminase or the system of claim 48, wherein the mutation on S375 is S375N.
 50. An engineered CRISPR-Cas protein comprising one or more HEPN domains and further comprising one or more modified amino acids, wherein the amino acids: a. interact with a guide RNA that forms a complex with the engineered CRISPR-Cas protein; b. are in a HEPN active site, an inter-domain linker domain, a lid domain, a helical domain 1, a helical domain 2, or a bridge helix domain of the engineered CRISPR-Cas protein; or c. a combination thereof.
 51. The engineered CRISPR-Cas protein of claim 50, wherein the HEPN domain comprises a RxxxxH motif.
 52. The engineered CRISPR-Cas protein of claim 51, wherein the RxxxxH motif comprises a R{N/H/K}X₁X₂X₃H sequence.
 53. The engineered CRISPR-Cas protein of claim 52, wherein: X₁ is R, S, D, E, Q, N, G, or Y, X₂ is independently I, S, T, V, or L, and X₃ is independently L, F, N, Y, V, I, S, D, E, or A.
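By way of illustration only, and not as part of any claim, the RxxxxH motif of claim 51 and the narrower R{N/H/K}X₁X₂X₃H sequence of claims 52 and 53 can be written as regular expressions over one-letter amino acid codes. The sketch below assumes plain string sequences; the function name `find_motifs` and the example sequence are hypothetical and are not taken from the disclosure or the sequence listing.

```python
import re

# Claim 51: R, any four residues, then H.
GENERAL_RXXXXH = r"R.{4}H"

# Claims 52-53: R, then N/H/K, then X1, X2, X3 drawn from the recited sets, then H.
SPECIFIC_RXXXXH = r"R[NHK][RSDEQNGY][ISTVL][LFNYVISDEA]H"


def find_motifs(sequence, pattern):
    """Return (1-based start position, matched text) for every occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence.upper())]


# Hypothetical toy sequence, used only to exercise the two patterns.
example = "MKRNSILHAAARAAAAHGG"
general_hits = find_motifs(example, GENERAL_RXXXXH)
specific_hits = find_motifs(example, SPECIFIC_RXXXXH)
```

On this toy sequence the general pattern reports two sites, and only the first of them also satisfies the narrower pattern.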
 54. The engineered CRISPR-Cas protein of claim 50, wherein the CRISPR-Cas protein is a Type VI CRISPR-Cas protein.
 55. The engineered CRISPR-Cas protein of claim 54, wherein the Type VI CRISPR-Cas protein is a Cas13.
 56. The engineered CRISPR-Cas protein of claim 55, wherein the Type VI CRISPR-Cas protein is Cas13a, Cas13b, Cas13c, or Cas13d.
 57. The engineered CRISPR-Cas protein of claim 55, comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296, N297, Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398, E399, K294, E400, R56, N157, H161, H452, N455, K484, N486, G566, H567, A656, V795, A796, W842, K871, E873, R874, R1068, N1069, or H1073.
 58. The engineered CRISPR-Cas protein of claim 55, comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296, N297, Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398, E399, K294, E400, R56, N157, H161, H452, N455, K484, N486, G566, H567, W842, K871, E873, R874, R1068, N1069, or H1073.
 59. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R53, K943, R1041, Y164, R285, R287, K292, E296, N297, Q646, N647, R402, K393, N653, N652, R482, N480, D396, E397, D398, E399, K294, or E400.
 60. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K393, R402, N482, T405, H407, S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, R56, N157, H161, R1068, N1069, or H1073.
 61. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N482, H407, S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, R56, N157, H161, R1068, N1069, or H1073.
 62. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: W842, K846, K870, E873, or R877.
 63. The engineered CRISPR-Cas protein of claim 55comprising in helical domain 1 one or more mutation of an amino acidcorresponding to the following amino acids in helical domain 1 ofPbCas13b: W842, K846, K870, E873, or R877.
 64. The engineered CRISPR-Casprotein of claim 55 comprising in helical domain 1-3 one or moremutation of an amino acid corresponding to the following amino acids inhelical domain 1-3 of PbCas13b: W842, K846, K870, E873, or R877.
 65. Theengineered CRISPR-Cas protein of claim 55 comprising in the bridge helixdomain one or more mutation of an amino acid corresponding to thefollowing amino acids in the bridge helix domain of PbCas13b: W842,K846, K870, E873, or R877.
 66. The engineered CRISPR-Cas protein ofclaim 55 comprising one or more mutation of an amino acid correspondingto the following amino acids of PbCas13b: K393, R402, N480, N482, N652,or N653.
 67. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N480, or N482.
 68. The engineered CRISPR-Cas protein of claim 55 comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: K393, R402, N480, or N482.
 69. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: N652 or N653.
 70. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of PbCas13b: N652 or N653.
 71. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: T405, H407, S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741.
 72. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H407, S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741.
 73. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, A656, K655, N652, H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, S757, N756, or K741.
 74. The engineered CRISPR-Cas protein of claim 55 comprising in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of PbCas13b: S658, N653, A656, K655, N652, H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, S757, N756, or K741.
 75. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, S757, or N756.
 76. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, S757, or N756.
 77. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, R762, V795, A796, R791, G566, S757, or N756.
 78. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: H567, H500, R762, V795, A796, R791, G566, S757, or N756.
 79. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K871, K857, K870, W842, E873, R877, K846, or R874.
 80. The engineered CRISPR-Cas protein of claim 55 comprising in the bridge helix domain one or more mutation of an amino acid corresponding to the following amino acids in the bridge helix domain of PbCas13b: K871, K857, K870, W842, E873, R877, K846, or R874.
 81. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, or G566.
 82. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1-2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-2 of PbCas13b: H567, H500, or G566.
 83. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, S757, or N756.
 84. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, S757, or N756.
 85. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: R762, V795, A796, R791, S757, or N756.
 86. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: R762, V795, A796, R791, S757, or N756.
 87. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, A656, K655, N652, K590, R638, or K741.
 88. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of PbCas13b: S658, N653, A656, K655, N652, K590, R638, or K741.
 89. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: T405, H407, N486, K484, N480, H452, N455, or K457.
 90. The engineered CRISPR-Cas protein of claim 55 comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: T405, H407, N486, K484, N480, H452, N455, or K457.
 91. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, K655, N652, H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, S757, N756, or K741.
 92. The engineered CRISPR-Cas protein of claim 55 comprising in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of PbCas13b: S658, N653, K655, N652, H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, S757, N756, or K741.
 93. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, S757, or N756.
 94. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: H567, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, S757, or N756.
 95. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H567, H500, R762, R791, G566, S757, or N756.
 96. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of PbCas13b: H567, H500, R762, R791, G566, S757, or N756.
 97. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, S757, or N756.
 98. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, S757, or N756.
 99. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: R762, R791, S757, or N756.
 100. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of PbCas13b: R762, R791, S757, or N756.
 101. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, K655, N652, K590, R638, or K741.
 102. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of PbCas13b: S658, N653, K655, N652, K590, R638, or K741.
 103. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H407, N486, K484, N480, H452, N455, or K457.
 104. The engineered CRISPR-Cas protein of claim 55 comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: H407, N486, K484, N480, H452, N455, or K457.
 105. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: R56, N157, H161, R1068, N1069, or H1073.
 106. The engineered CRISPR-Cas protein of claim 55 comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of PbCas13b: R56, N157, H161, R1068, N1069, or H1073.
 107. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: R56, N157, or H161.
 108. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of PbCas13b: R56, N157, or H161.
 109. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: R1068, N1069, or H1073.
 110. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of PbCas13b: R1068, N1069, or H1073.
 111. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N482, T405, H407, N486, K484, N480, H452, N455, or K457.
 112. The engineered CRISPR-Cas protein of claim 55 comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: K393, R402, N482, T405, H407, N486, K484, N480, H452, N455, or K457.
 113. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N482, H407, N486, K484, N480, H452, N455, or K457.
 114. The engineered CRISPR-Cas protein of claim 55 comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: K393, R402, N482, H407, N486, K484, N480, H452, N455, or K457.
 115. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: T405, H407, S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402, or N482.
 116. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: H407, S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402, or N482.
 117. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741.
 118. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, or K741.
 119. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: N486, K484, N480, H452, N455, or K457.
 120. The engineered CRISPR-Cas protein of claim 55 comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: N486, K484, N480, H452, N455, or K457.
 121. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: K393, R402, N482, N486, K484, N480, H452, N455, or K457.
 122. The engineered CRISPR-Cas protein of claim 55 comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of PbCas13b: K393, R402, N482, N486, K484, N480, H452, N455, or K457.
 123. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, A656, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, V795, A796, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402, or N482.
 124. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of PbCas13b: S658, N653, K655, N652, H567, N455, H500, K871, K857, K870, W842, E873, R877, K846, R874, R762, R791, G566, K590, R638, H452, S757, N756, N486, K484, N480, K457, K741, K393, R402, or N482.
 125. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041.
 126. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53 or Y164.
 127. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943 or R1041.
 128. The engineered CRISPR-Cas protein of claim 55 comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041.
 129. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53 or Y164.
 130. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943 or R1041.
 131. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, R1041, R56, N157, H161, R1068, N1069, or H1073.
 132. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, R56, N157, or H161.
 133. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073.
 134. The engineered CRISPR-Cas protein of claim 55 comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, R1041, R56, N157, H161, R1068, N1069, or H1073.
 135. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53, Y164, R56, N157, or H161.
 136. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073.
 137. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, or R1041.
 138. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, or K193.
 139. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943 or R1041.
 140. The engineered CRISPR-Cas protein of claim 55 comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, or R1041.
 141. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, or K193.
 142. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943 or R1041.
 143. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, R1041, R56, N157, H161, R1068, N1069, or H1073.
 144. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, R56, N157, or H161.
 145. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073.
 146. The engineered CRISPR-Cas protein of claim 55 comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, K943, R1041, R56, N157, H161, R1068, N1069, or H1073.
 147. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K183, K193, R56, N157, or H161.
 148. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Prevotella buccae Cas13b (PbCas13b): K943, R1041, R1068, N1069, or H1073.
 149. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K183 or K193.
 150. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b): K183 or K193.
 151. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041.
 152. The engineered CRISPR-Cas protein of claim 55 comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, Y164, K943, or R1041.
 153. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, K943, or R1041; preferably R53A, R53K, R53D, or R53E; K943A, K943R, K943D, or K943E; or R1041A, R1041K, R1041D, or R1041E.
 154. The engineered CRISPR-Cas protein of claim 55 comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53, K943, or R1041; preferably R53A, R53K, R53D, or R53E; K943A, K943R, K943D, or K943E; or R1041A, R1041K, R1041D, or R1041E.
 155. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid Y164 of Prevotella buccae Cas13b (PbCas13b), preferably Y164A, Y164F, or Y164W.
 156. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 1 a mutation of an amino acid corresponding to amino acid Y164 in HEPN domain 1 of Prevotella buccae Cas13b (PbCas13b), preferably Y164A, Y164F, or Y164W.
 157. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, D434, K431, R402, K393, R482, N480, D396, E397, D398, or E399.
 158. The engineered CRISPR-Cas protein of claim 55 comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, D434, K431, R402, K393, R482, N480, D396, E397, D398, or E399.
 159. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid H407 of Prevotella buccae Cas13b (PbCas13b), preferably H407Y, H407W, or H407F.
 160. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R402, K393, R482, N480, D396, E397, D398, or E399.
 161. The engineered CRISPR-Cas protein of claim 55 comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): R402, K393, R482, N480, D396, E397, D398, or E399.
 162. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K457, D434, or K431.
 163. The engineered CRISPR-Cas protein of claim 55 comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): K457, D434, or K431.
 164. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, Q646, N647, N653, or N652.
 165. The engineered CRISPR-Cas protein of claim 55 comprising in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of Prevotella buccae Cas13b (PbCas13b): H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, Q646, N647, N653, or N652.
 166. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
 167. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
 168. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, or R791.
 169. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1 of Prevotella buccae Cas13b (PbCas13b): H500, K570, N756, S757, R762, or R791.
 170. The engineered CRISPR-Cas protein of claim 55 comprisingone or more mutation of an amino acid corresponding to the followingamino acids of Prevotella buccae Cas13b (PbCas13b): K846, K857, K870,R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
 171. Theengineered CRISPR-Cas protein of claim 55 comprising in the bridge helixdomain one or more mutation of an amino acid corresponding to thefollowing amino acids in the bridge helix domain of Prevotella buccaeCas13b (PbCas13b): K846, K857, K870, R877, K826, K828, K829, R824, R830,Q831, K835, K836, or R838.
 172. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): H500 or K570.
 173. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1-2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-2 of Prevotella buccae Cas13b (PbCas13b): H500 or K570.
 174. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
 175. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, R877, K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
 176. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, or R791.
 177. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, or R791.
 178. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, or R877.
 179. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): N756, S757, R762, R791, K846, K857, K870, or R877.
 180. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
 181. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 1-3 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 1-3 of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, or R838.
 182. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, K744, R600, K607, K612, R614, K617, R618, Q646, N647, N653, or N652.
 183. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, K744, R600, K607, K612, R614, K617, R618, Q646, N647, N653, or N652.
 184. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): Q646 or N647.
 185. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): Q646 or N647.
 186. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N653 or N652.
 187. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): N653 or N652.
 188. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, or K744.
 189. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): K590, N634, R638, N652, N653, K655, S658, K741, or K744.
 190. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R600, K607, K612, R614, K617, or R618.
 191. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): R600, K607, K612, R614, K617, or R618.
 192. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, E296, N297, or K294.
 193. The engineered CRISPR-Cas protein of claim 55 comprising in the IDL domain one or more mutation of an amino acid corresponding to the following amino acids in the IDL domain of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, E296, N297, or K294.
 194. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, K292, E296, or N297.
 195. The engineered CRISPR-Cas protein of claim 55 comprising in the IDL domain one or more mutation of an amino acid corresponding to the following amino acids in the IDL domain of Prevotella buccae Cas13b (PbCas13b): R285, K292, E296, or N297.
 196. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, R877, K183, K193, R600, K607, K612, R614, K617, K826, K828, K829, R824, R830, Q831, K835, K836, R838, R618, D434, K431, R285, R287, K292, E296, N297, Q646, N647, or K294.
 197. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R402, K393, N653, N652, R482, N480, D396, E397, D398, or E399.
 198. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53, K655, R762, or R1041; preferably R53A or R53D; K655A; R762A; or R1041E or R1041D.
 199. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N297, E296, K292, or R285; preferably N297A, E296A, K292A, or R285A.
 200. The engineered CRISPR-Cas protein of claim 55 comprising in (the central channel of) the IDL domain one or more mutation of an amino acid corresponding to the following amino acids in (the central channel of) the IDL domain of Prevotella buccae Cas13b (PbCas13b): N297, E296, K292, or R285; preferably N297A, E296A, K292A, or R285A.
 201. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): Q831, K836, R838, N652, N653, R830, K655 or R762; preferably Q831A, K836A, R838A, N652A, N653A, R830A, K655A, or R762A.
 202. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): N652, N653, R830, K655 or R762; preferably N652A, N653A, R830A, K655A, or R762A.
 203. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K655 or R762; preferably K655A or R762A.
 204. The engineered CRISPR-Cas protein of claim 55 comprising in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of Prevotella buccae Cas13b (PbCas13b): Q831, K836, R838, N652, N653, R830, K655 or R762; preferably Q831A, K836A, R838A, N652A, N653A, R830A, K655A, or R762A.
 205. The engineered CRISPR-Cas protein of claim 55 comprising in a helical domain one or more mutation of an amino acid corresponding to the following amino acids in a helical domain of Prevotella buccae Cas13b (PbCas13b): N652, N653, R830, K655 or R762; preferably N652A, N653A, R830A, K655A, or R762A.
 206. The engineered CRISPR-Cas protein of claim 55 comprising in helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in helical domain 2 of Prevotella buccae Cas13b (PbCas13b): K655 or R762; preferably K655A or R762A.
 207. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R614, K607, K193, K183 or R600; preferably R614A, K607A, K193A, K183A or R600A.
 208. The engineered CRISPR-Cas protein of claim 55 comprising in the trans-subunit loop of helical domain 2 one or more mutation of an amino acid corresponding to the following amino acids in the trans-subunit loop of helical domain 2 of Prevotella buccae Cas13b (PbCas13b): Q646 or N647; preferably Q646A or N647A.
 209. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R53 or R1041; preferably R53A or R53D, or R1041E or R1041D.
 210. The engineered CRISPR-Cas protein of claim 55 comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella buccae Cas13b (PbCas13b): R53 or R1041; preferably R53A or R53D, or R1041E or R1041D.
 211. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K457, D397, E398, D399, E400, T405, H407 or D434; preferably D397A, E398A, D399A, E400A, T405A, H407A, H407W, H407Y, H407F or D434A.
 212. The engineered CRISPR-Cas protein of claim 55 comprising in the LID domain one or more mutation of an amino acid corresponding to the following amino acids in the LID domain of Prevotella buccae Cas13b (PbCas13b): K457, D397, E398, D399, E400, T405, H407 or D434; preferably D397A, E398A, D399A, E400A, T405A, H407A, H407W, H407Y, H407F or D434A.
 213. The engineered CRISPR-Cas protein of claim 55, wherein the amino acids correspond to the following amino acids of Prevotella buccae Cas13b (PbCas13b): amino acids 46-57, 73-79, 152-164, 1036-1046, and 1064-1074.
 214. The engineered CRISPR-Cas protein of claim 55, comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R156, N157, H161, R1068, N1069, and H1073.
 215. The engineered CRISPR-Cas protein of claim 55, comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): R285, R287, K292, K294, E296, and N297.
 216. The engineered CRISPR-Cas protein of claim 55, comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): K826, K828, K829, R824, R830, Q831, K835, K836, and R838.
 217. The engineered CRISPR-Cas protein of claim 55, comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella buccae Cas13b (PbCas13b): T405, H407, K457, H500, K570, K590, N634, R638, N652, N653, K655, S658, K741, K744, N756, S757, R762, R791, K846, K857, K870, and R877.
 218. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid T405 of Prevotella buccae Cas13b (PbCas13b).
 219. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid H407 of Prevotella buccae Cas13b (PbCas13b).
 220. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K457 of Prevotella buccae Cas13b (PbCas13b).
 221. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid H500 of Prevotella buccae Cas13b (PbCas13b).
 222. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K570 of Prevotella buccae Cas13b (PbCas13b).
 223. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K590 of Prevotella buccae Cas13b (PbCas13b).
 224. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N634 of Prevotella buccae Cas13b (PbCas13b).
 225. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R638 of Prevotella buccae Cas13b (PbCas13b).
 226. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N652 of Prevotella buccae Cas13b (PbCas13b).
 227. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N653 of Prevotella buccae Cas13b (PbCas13b).
 228. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K655 of Prevotella buccae Cas13b (PbCas13b).
 229. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid S658 of Prevotella buccae Cas13b (PbCas13b).
 230. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K741 of Prevotella buccae Cas13b (PbCas13b).
 231. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K744 of Prevotella buccae Cas13b (PbCas13b).
 232. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N756 of Prevotella buccae Cas13b (PbCas13b).
 233. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid S757 of Prevotella buccae Cas13b (PbCas13b).
 234. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R762 of Prevotella buccae Cas13b (PbCas13b).
 235. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R791 of Prevotella buccae Cas13b (PbCas13b).
 236. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K846 of Prevotella buccae Cas13b (PbCas13b).
 237. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K857 of Prevotella buccae Cas13b (PbCas13b).
 238. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K870 of Prevotella buccae Cas13b (PbCas13b).
 239. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R877 of Prevotella buccae Cas13b (PbCas13b).
 240. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K183 of Prevotella buccae Cas13b (PbCas13b).
 241. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K193 of Prevotella buccae Cas13b (PbCas13b).
 242. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R600 of Prevotella buccae Cas13b (PbCas13b).
 243. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K607 of Prevotella buccae Cas13b (PbCas13b).
 244. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K612 of Prevotella buccae Cas13b (PbCas13b).
 245. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R614 of Prevotella buccae Cas13b (PbCas13b).
 246. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K617 of Prevotella buccae Cas13b (PbCas13b).
 247. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K826 of Prevotella buccae Cas13b (PbCas13b).
 248. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K828 of Prevotella buccae Cas13b (PbCas13b).
 249. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K829 of Prevotella buccae Cas13b (PbCas13b).
 250. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R824 of Prevotella buccae Cas13b (PbCas13b).
 251. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R830 of Prevotella buccae Cas13b (PbCas13b).
 252. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid Q831 of Prevotella buccae Cas13b (PbCas13b).
 253. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K835 of Prevotella buccae Cas13b (PbCas13b).
 254. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K836 of Prevotella buccae Cas13b (PbCas13b).
 255. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R838 of Prevotella buccae Cas13b (PbCas13b).
 256. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R618 of Prevotella buccae Cas13b (PbCas13b).
 257. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid D434 of Prevotella buccae Cas13b (PbCas13b).
 258. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K431 of Prevotella buccae Cas13b (PbCas13b).
 259. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R53 of Prevotella buccae Cas13b (PbCas13b).
 260. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K943 of Prevotella buccae Cas13b (PbCas13b).
 261. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R1041 of Prevotella buccae Cas13b (PbCas13b).
 262. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid Y164 of Prevotella buccae Cas13b (PbCas13b).
 263. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R285 of Prevotella buccae Cas13b (PbCas13b).
 264. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R287 of Prevotella buccae Cas13b (PbCas13b).
 265. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K292 of Prevotella buccae Cas13b (PbCas13b).
 266. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid E296 of Prevotella buccae Cas13b (PbCas13b).
 267. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N297 of Prevotella buccae Cas13b (PbCas13b).
 268. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid Q646 of Prevotella buccae Cas13b (PbCas13b).
 269. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N647 of Prevotella buccae Cas13b (PbCas13b).
 270. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R402 of Prevotella buccae Cas13b (PbCas13b).
 271. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K393 of Prevotella buccae Cas13b (PbCas13b).
 272. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N653 of Prevotella buccae Cas13b (PbCas13b).
 273. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N652 of Prevotella buccae Cas13b (PbCas13b).
 274. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R482 of Prevotella buccae Cas13b (PbCas13b).
 275. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N480 of Prevotella buccae Cas13b (PbCas13b).
 276. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid D396 of Prevotella buccae Cas13b (PbCas13b).
 277. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid E397 of Prevotella buccae Cas13b (PbCas13b).
 278. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid D398 of Prevotella buccae Cas13b (PbCas13b).
 279. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid E399 of Prevotella buccae Cas13b (PbCas13b).
 280. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K294 of Prevotella buccae Cas13b (PbCas13b).
 281. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid E400 of Prevotella buccae Cas13b (PbCas13b).
 282. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R56 of Prevotella buccae Cas13b (PbCas13b).
 283. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N157 of Prevotella buccae Cas13b (PbCas13b).
 284. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid H161 of Prevotella buccae Cas13b (PbCas13b).
 285. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid H452 of Prevotella buccae Cas13b (PbCas13b).
 286. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N455 of Prevotella buccae Cas13b (PbCas13b).
 287. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K484 of Prevotella buccae Cas13b (PbCas13b).
 288. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N486 of Prevotella buccae Cas13b (PbCas13b).
 289. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid G566 of Prevotella buccae Cas13b (PbCas13b).
 290. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid H567 of Prevotella buccae Cas13b (PbCas13b).
 291. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid A656 of Prevotella buccae Cas13b (PbCas13b).
 292. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid V795 of Prevotella buccae Cas13b (PbCas13b).
 293. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid A796 of Prevotella buccae Cas13b (PbCas13b).
 294. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid W842 of Prevotella buccae Cas13b (PbCas13b).
 295. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid K871 of Prevotella buccae Cas13b (PbCas13b).
 296. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid E873 of Prevotella buccae Cas13b (PbCas13b).
 297. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R874 of Prevotella buccae Cas13b (PbCas13b).
 298. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid R1068 of Prevotella buccae Cas13b (PbCas13b).
 299. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid N1069 of Prevotella buccae Cas13b (PbCas13b).
 300. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid H1073 of Prevotella buccae Cas13b (PbCas13b).
 301. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, H602, R1278, N1279, or H1283.
 302. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, H602, R1278, N1279, or H1283.
 303. The engineered CRISPR-Cas protein of claim 55 comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, H602, R1278, N1279, or H1283.
 304. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, or H602.
 305. The engineered CRISPR-Cas protein of claim 55, comprising in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Leptotrichia shahii Cas13a (LshCas13a): R597, N598, or H602.
 306. The engineered CRISPR-Cas protein of claim 55, comprising one or more mutation of an amino acid corresponding to the following amino acids of Leptotrichia shahii Cas13a (LshCas13a): R1278, N1279, or H1283.
 307. The engineered CRISPR-Cas protein of claim 55, comprising in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Leptotrichia shahii Cas13a (LshCas13a): R1278, N1279, or H1283.
 308. The engineered CRISPR-Cas protein of claim 55, comprising one or more mutation of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R146, H151, R1116, or H1121.
 309. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R146, H151, R1116, or H1121.
 310. The engineered CRISPR-Cas protein of claim 55 comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Porphyromonas gulae Cas13b (PguCas13b): R146, H151, R1116, or H1121.
 311. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R146 or H151.
 312. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 1 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 1 of Porphyromonas gulae Cas13b (PguCas13b): R146 or H151.
 313. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Porphyromonas gulae Cas13b (PguCas13b): R1116 or H1121.
 314. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 2 one or more mutation of an amino acid corresponding to the following amino acids in HEPN domain 2 of Porphyromonas gulae Cas13b (PguCas13b): R1116 or H1121.
 315. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella sp. P5-125 Cas13b (PspCas13b): H133 or H1058.
 316. The engineered CRISPR-Cas protein of claim 55 comprising one or more mutation of an amino acid corresponding to the following amino acids of Prevotella sp. P5-125 Cas13b (PspCas13b): H133 or H1058.
 317. The engineered CRISPR-Cas protein of claim 55 comprising in a HEPN domain one or more mutation of an amino acid corresponding to the following amino acids in a HEPN domain of Prevotella sp. P5-125 Cas13b (PspCas13b): H133 or H1058.
 318. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid H133 of Prevotella sp. P5-125 Cas13b (PspCas13b).
 319. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 1 a mutation of an amino acid corresponding to amino acid H133 in HEPN domain 1 of Prevotella sp. P5-125 Cas13b (PspCas13b).
 320. The engineered CRISPR-Cas protein of claim 55 comprising a mutation of an amino acid corresponding to amino acid H1058 of Prevotella sp. P5-125 Cas13b (PspCas13b).
 321. The engineered CRISPR-Cas protein of claim 55 comprising in HEPN domain 2 a mutation of an amino acid corresponding to the amino acid H1058 in HEPN domain 2 of Prevotella sp. P5-125 Cas13b (PspCas13b).
 322. The engineered CRISPR-Cas protein of any of claims 57 to 321, wherein said amino acid is mutated to A, P, or V, preferably A.
 323. The engineered CRISPR-Cas protein of any of claims 57 to 321, wherein said amino acid is mutated to a hydrophobic amino acid.
 324. The engineered CRISPR-Cas protein of any of claims 57 to 321, wherein said amino acid is mutated to an aromatic amino acid.
 325. The engineered CRISPR-Cas protein of any of claims 57 to 321, wherein said amino acid is mutated to a charged amino acid.
 326. The engineered CRISPR-Cas protein of any of claims 57 to 321, wherein said amino acid is mutated to a positively charged amino acid.
 327. The engineered CRISPR-Cas protein of any of claims 57 to 321, wherein said amino acid is mutated to a negatively charged amino acid.
 328. The engineered CRISPR-Cas protein of any of claims 57 to 321, wherein said amino acid is mutated to a polar amino acid.
 329. The engineered CRISPR-Cas protein of any of claims 57 to 321, wherein said amino acid is mutated to an aliphatic amino acid.
 330. The engineered CRISPR-Cas protein of claim 55, wherein said Cas13 protein is or originates from a species of the genus Alistipes, Anaerosalibacter, Bacteroides, Bacteroidetes, Bergeyella, Blautia, Butyrivibrio, Capnocytophaga, Carnobacterium, Chloroflexus, Chryseobacterium, Clostridium, Demequina, Eubacteriaceae, Eubacterium, Flavobacterium, Fusobacterium, Herbinix, Insolitispirillum, Lachnospiraceae, Leptotrichia, Listeria, Myroides, Paludibacter, Phaeodactylibacter, Porphyromonadaceae, Porphyromonas, Prevotella, Pseudobutyrivibrio, Psychroflexus, Reichenbachiella, Rhodobacter, Riemerella, Sinomicrobium, Thalassospira, Ruminococcus; preferably Leptotrichia shahii, Listeria seeligeri, Lachnospiraceae bacterium (such as Lb MA2020, Lb NK4A179, Lb NK4A144), Clostridium aminophilum (such as Ca DSM 10710), Carnobacterium gallinarum (such as Cg DSM 4847), Paludibacter propionicigenes (such as Pp WB4), Listeria weihenstephanensis (such as Lw FSL R9-0317), Listeriaceae bacterium (such as Lb FSL M6-0635), Leptotrichia wadei (such as Lw F0279), Rhodobacter capsulatus (such as Rc SB 1003, Rc R121, Rc DE442), Leptotrichia buccalis (such as Lb C-1013-b), Herbinix hemicellulosilytica, Eubacteriaceae bacterium (such as Eb CHKCI004), Blautia sp. Marseille-P2398, Leptotrichia sp. oral taxon 879 str. F0557, Chloroflexus aggregans, Demequina aurantiaca, Thalassospira sp. TSL5-1, Pseudobutyrivibrio sp. OR37, Butyrivibrio sp. YAB3001, Leptotrichia sp. Marseille-P3007, Bacteroides ihuae, Porphyromonadaceae bacterium (such as Pb KH3CP3RA), Listeria riparia, Insolitispirillum peregrinum, Alistipes sp. ZOR0009, Bacteroides pyogenes (such as Bp F0041), Bacteroidetes bacterium (such as Bb GWA2_31_9), Bergeyella zoohelcum (such as Bz ATCC 43767), Capnocytophaga canimorsus, Capnocytophaga cynodegmi, Chryseobacterium carnipullorum, Chryseobacterium jejuense, Chryseobacterium ureilyticum, Flavobacterium branchiophilum, Flavobacterium columnare, Flavobacterium sp. 316, Myroides odoratimimus (such as Mo CCUG 10230, Mo CCUG 12901, Mo CCUG 3837), Paludibacter propionicigenes, Phaeodactylibacter xiamenensis, Porphyromonas gingivalis (such as Pg F0185, Pg F0568, Pg JCVI SC001, Pg W4087), Porphyromonas gulae, Porphyromonas sp. COT-052 OH4946, Prevotella aurantiaca, Prevotella buccae (such as Pb ATCC 33574), Prevotella falsenii, Prevotella intermedia (such as Pi 17, Pi ZT), Prevotella pallens (such as Pp ATCC 700821), Prevotella pleuritidis, Prevotella saccharolytica (such as Ps F0055), Prevotella sp. MA2016, Prevotella sp. MSX73, Prevotella sp. P4-76, Prevotella sp. P5-119, Prevotella sp. P5-125, Prevotella sp. P5-60, Psychroflexus torquis, Reichenbachiella agariperforans, Riemerella anatipestifer, Sinomicrobium oceani, Fusobacterium necrophorum (such as Fn subsp. funduliforme ATCC 51357, Fn DJ-2, Fn BFTR-1, Fn subsp. funduliforme), Fusobacterium perfoetens (such as Fp ATCC 29250), Fusobacterium ulcerans (such as Fu ATCC 49185), Anaerosalibacter sp. ND1, Eubacterium siraeum, Ruminococcus flavefaciens (such as Rfx XPD3002), or Ruminococcus albus.
331. The engineered CRISPR-Cas protein of claim 55, wherein said Cas13 protein is a Cas13a protein.
332. The engineered CRISPR-Cas protein of claim 331, wherein said Cas13a protein is or originates from a species of the genus Bacteroides, Blautia, Butyrivibrio, Carnobacterium, Chloroflexus, Clostridium, Demequina, Eubacterium, Herbinix, Insolitispirillum, Lachnospiraceae, Leptotrichia, Listeria, Paludibacter, Porphyromonadaceae, Pseudobutyrivibrio, Rhodobacter, or Thalassospira; preferably Leptotrichia shahii, Listeria seeligeri, Lachnospiraceae bacterium (such as Lb MA2020, Lb NK4A179, Lb NK4A144), Clostridium aminophilum (such as Ca DSM 10710), Carnobacterium gallinarum (such as Cg DSM 4847), Paludibacter propionicigenes (such as Pp WB4), Listeria weihenstephanensis (such as Lw FSL R9-0317), Listeriaceae bacterium (such as Lb FSL M6-0635), Leptotrichia wadei (such as Lw F0279), Rhodobacter capsulatus (such as Rc SB 1003, Rc R121, Rc DE442), Leptotrichia buccalis (such as Lb C-1013-b), Herbinix hemicellulosilytica, Eubacteriaceae bacterium (such as Eb CHKCI004), Blautia sp. Marseille-P2398, Leptotrichia sp. oral taxon 879 str. F0557, Chloroflexus aggregans, Demequina aurantiaca, Thalassospira sp. TSL5-1, Pseudobutyrivibrio sp. OR37, Butyrivibrio sp. YAB3001, Leptotrichia sp. Marseille-P3007, Bacteroides ihuae, Porphyromonadaceae bacterium (such as Pb KH3CP3RA), Listeria riparia, or Insolitispirillum peregrinum.
333. The engineered CRISPR-Cas protein of claim 55, wherein said Cas13 protein is a Cas13b protein.
334. The engineered CRISPR-Cas protein of claim 333, wherein said Cas13b protein is or originates from a species of the genus Alistipes, Bacteroides, Bacteroidetes, Bergeyella, Capnocytophaga, Chryseobacterium, Flavobacterium, Myroides, Paludibacter, Phaeodactylibacter, Porphyromonas, Prevotella, Psychroflexus, Reichenbachiella, Riemerella, or Sinomicrobium; preferably Alistipes sp. ZOR0009, Bacteroides pyogenes (such as Bp F0041), Bacteroidetes bacterium (such as Bb GWA2_31_9), Bergeyella zoohelcum (such as Bz ATCC 43767), Capnocytophaga canimorsus, Capnocytophaga cynodegmi, Chryseobacterium carnipullorum, Chryseobacterium jejuense, Chryseobacterium ureilyticum, Flavobacterium branchiophilum, Flavobacterium columnare, Flavobacterium sp. 316, Myroides odoratimimus (such as Mo CCUG 10230, Mo CCUG 12901, Mo CCUG 3837), Paludibacter propionicigenes, Phaeodactylibacter xiamenensis, Porphyromonas gingivalis (such as Pg F0185, Pg F0568, Pg JCVI SC001, Pg W4087), Porphyromonas gulae, Porphyromonas sp. COT-052 OH4946, Prevotella aurantiaca, Prevotella buccae (such as Pb ATCC 33574), Prevotella falsenii, Prevotella intermedia (such as Pi 17, Pi ZT), Prevotella pallens (such as Pp ATCC 700821), Prevotella pleuritidis, Prevotella saccharolytica (such as Ps F0055), Prevotella sp. MA2016, Prevotella sp. MSX73, Prevotella sp. P4-76, Prevotella sp. P5-119, Prevotella sp. P5-125, Prevotella sp. P5-60, Psychroflexus torquis, Reichenbachiella agariperforans, Riemerella anatipestifer, or Sinomicrobium oceani.
335. The engineered CRISPR-Cas protein of claim 55, wherein said Cas13 protein is a Cas13c protein.
336. The engineered CRISPR-Cas protein of claim 335, wherein said Cas13c protein is or originates from a species of the genus Fusobacterium or Anaerosalibacter; preferably Fusobacterium necrophorum (such as Fn subsp. funduliforme ATCC 51357, Fn DJ-2, Fn BFTR-1, Fn subsp. funduliforme), Fusobacterium perfoetens (such as Fp ATCC 29250), Fusobacterium ulcerans (such as Fu ATCC 49185), or Anaerosalibacter sp. ND1.
337. The engineered CRISPR-Cas protein of claim 55, wherein said Cas13 protein is a Cas13d protein.
338. The engineered CRISPR-Cas protein of claim 337, wherein said Cas13d protein is or originates from a species of the genus Eubacterium or Ruminococcus, preferably Eubacterium siraeum, Ruminococcus flavefaciens (such as Rfx XPD3002), or Ruminococcus albus.
339. The engineered CRISPR-Cas protein of claim 50, wherein catalytic activity of the engineered CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein.
340. The engineered CRISPR-Cas protein of claim 50, wherein catalytic activity of the engineered CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein.
341. The engineered CRISPR-Cas protein of claim 50, wherein gRNA binding of the engineered CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein.
342. The engineered CRISPR-Cas protein of claim 50, wherein gRNA binding of the engineered CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein.
343. The engineered CRISPR-Cas protein of claim 50, wherein specificity of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein.
344. The engineered CRISPR-Cas protein of claim 50, wherein specificity of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein.
345. The engineered CRISPR-Cas protein of claim 50, wherein stability of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein.
346. The engineered CRISPR-Cas protein of claim 50, wherein stability of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein.
347. The engineered CRISPR-Cas protein of claim 50, further comprising one or more mutations which inactivate catalytic activity.
348. The engineered CRISPR-Cas protein of claim 50, wherein off-target binding of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein.
349. The engineered CRISPR-Cas protein of claim 50, wherein off-target binding of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein.
350. The engineered CRISPR-Cas protein of claim 50, wherein target binding of the CRISPR-Cas protein is increased as compared to a corresponding wildtype CRISPR-Cas protein.
351. The engineered CRISPR-Cas protein of claim 50, wherein target binding of the CRISPR-Cas protein is decreased as compared to a corresponding wildtype CRISPR-Cas protein.
352. The engineered CRISPR-Cas protein of claim 50, wherein the engineered CRISPR-Cas protein has a higher protease activity or polynucleotide-binding capability compared to a corresponding wildtype CRISPR-Cas protein.
353. The engineered CRISPR-Cas protein of claim 50, wherein PFS recognition is altered as compared to a corresponding wildtype CRISPR-Cas protein.
354. The engineered CRISPR-Cas protein of claim 1, further comprising a functional heterologous domain.
355. The engineered CRISPR-Cas protein of claim 50, further comprising an NLS.
356. The engineered CRISPR-Cas protein of claim 50, further comprising an NES.
357. An engineered CRISPR-Cas protein that comprises one or more HEPN domains and is less than 1000 amino acids in length.
358. The engineered CRISPR-Cas protein of claim 357, wherein the protein is less than 950, less than 900, less than 850, less than 800, or less than 750 amino acids in size.
359. The engineered CRISPR-Cas protein of claim 357, wherein the HEPN domain comprises an RxxxxH motif.
360. The engineered CRISPR-Cas protein of claim 359, wherein the RxxxxH motif comprises an R[N/H/K]X₁X₂X₃H sequence.
361. The engineered CRISPR-Cas protein of claim 360, wherein: X₁ is R, S, D, E, Q, N, G, or Y, X₂ is independently I, S, T, V, or L, and X₃ is independently L, F, N, Y, V, I, S, D, E, or A.
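As an illustration only, and not as part of the claims, the motif language of claims 359 to 361 can be read as a pattern match over a one-letter amino acid sequence. The following Python sketch assumes that reading. The names X1, X2, X3, RXXXXH, NARROW, and find_motifs are hypothetical, and the example sequence is invented for demonstration rather than taken from any Cas13 protein.

```python
import re

# Residue sets copied from claim 361 (one-letter amino acid codes).
X1 = "RSDEQNGY"
X2 = "ISTVL"
X3 = "LFNYVISDEA"

# Broad motif of claim 359: R, any four residues, then H.
RXXXXH = re.compile(r"R.{4}H")
# Narrower motif of claims 360-361: R[N/H/K]X1X2X3H.
NARROW = re.compile(rf"R[NHK][{X1}][{X2}][{X3}]H")


def find_motifs(seq, pattern=NARROW):
    """Return (0-based start, matched text) for every match, including overlapping ones."""
    # A lookahead wrapper lets the scan report matches that overlap each other.
    lookahead = re.compile(f"(?=({pattern.pattern}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(seq.upper())]


if __name__ == "__main__":
    demo = "MKRNRILHAAARAAAAH"  # invented sequence, for illustration only
    print(find_motifs(demo))          # narrow motif hits
    print(find_motifs(demo, RXXXXH))  # broad RxxxxH hits
```

Under these assumptions, the narrow pattern accepts a subset of what the broad pattern accepts, which mirrors the dependency of claims 360 and 361 on claim 359.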
362. The engineered CRISPR-Cas protein of claim 357, wherein the CRISPR-Cas protein is a Type VI CRISPR-Cas protein.
363. The engineered CRISPR-Cas protein of claim 362, wherein the Type VI CRISPR-Cas protein is a Cas13a, a Cas13b, a Cas13c, or a Cas13d.
364. The engineered CRISPR-Cas protein of claim 357, wherein the CRISPR-Cas protein is associated with a functional domain.
365. The engineered CRISPR-Cas protein of claim 357, wherein the CRISPR-Cas protein comprises one or more mutations equivalent to mutations in any one of claims 57-329.
366. The engineered CRISPR-Cas protein of claim 365, wherein the CRISPR-Cas protein comprises one or more mutations in the helical domain.
367. The engineered CRISPR-Cas protein of claim 357, wherein the CRISPR-Cas protein is in a dead form or has nickase activity.
368. A polynucleotide encoding the engineered CRISPR-Cas protein of any of claims 1 to 367.
369. The polynucleotide according to claim 319, which is codon optimized.
370. A CRISPR-Cas system comprising the engineered CRISPR-Cas protein of any of claims 1 to 367 or the polynucleotide of claim 318 or 319, and a nucleotide component capable of forming a complex with the engineered CRISPR-Cas protein and able to hybridize with a target nucleic acid sequence and direct sequence-specific binding of said complex to the target nucleic acid sequence.
371. A vector system comprising one or more vectors, the one or more vectors comprising one or more polynucleotide molecules encoding components of the engineered CRISPR-Cas protein of claim 370.
372. A method of modifying a target nucleic acid comprising: introducing in a cell or organism that comprises the target nucleic acid, the engineered CRISPR-Cas protein according to any of claims 1 to 367, the polynucleic acid according to claim 368 or 369, the CRISPR-Cas system according to claim 370, or the vector or vector system according to claim 371, such that the engineered CRISPR-Cas protein modifies the target nucleic acid in the cell or organism.
373. The method of claim 372, wherein the engineered CRISPR-Cas system is introduced via delivery by liposomes, nanoparticles, exosomes, microvesicles, nucleic acid nanoassemblies, a gene gun, an implantable device, or the vector system of claim 371.
374. The method of claim 372, wherein the engineered CRISPR-Cas protein is associated with one or more functional domains.
375. The method of claim 372, wherein the target nucleic acid comprises a genomic locus, and the engineered CRISPR-Cas protein modifies a gene product encoded at the genomic locus or expression of the gene product.
376. The method of claim 372, wherein the target nucleic acid is DNA or RNA and wherein one or more nucleotides in the target nucleic acid are base edited.
377. The method of claim 372, wherein the target nucleic acid is DNA or RNA and wherein the target nucleic acid is cleaved.
378. The method of claim 377, wherein the engineered CRISPR-Cas protein further cleaves non-target nucleic acid.
379. The method of claim 377, further comprising visualizing activity and, optionally, using a detectable label.
380. The method of claim 377, further comprising detecting binding of one or more components of the CRISPR-Cas system to the target nucleic acid.
381. The method of claim 377, wherein said cell or organism is a eukaryotic cell or organism.
382. The method of claim 377, wherein said cell or organism is an animal cell or organism.
383. The method of claim 377, wherein said cell or organism is a plant cell or organism.
384. A method for detecting a target nucleic acid in a sample comprising: contacting a sample with: an engineered CRISPR-Cas protein of any one of claims 50 to 367; at least one guide polynucleotide comprising a guide sequence capable of binding to the target nucleic acid and designed to form a complex with the engineered CRISPR-Cas; and an RNA-based masking construct comprising a non-target sequence; wherein the engineered CRISPR-Cas protein exhibits collateral RNase activity and cleaves the non-target sequence of the detection construct; and detecting a signal from cleavage of the non-target sequence, thereby detecting the target nucleic acid in the sample.
385. The method of claim 384, further comprising contacting the sample with reagents for amplifying the target nucleic acid.
386. The method of claim 385, wherein the reagents for amplifying comprise isothermal amplification reaction reagents.
387. The method of claim 386, wherein the isothermal amplification reagents comprise nucleic-acid sequence-based amplification, recombinase polymerase amplification, loop-mediated isothermal amplification, strand displacement amplification, helicase-dependent amplification, or nicking enzyme amplification reagents.
388. The method of claim 384, wherein the target nucleic acid is a DNA molecule and the method further comprises contacting the target DNA molecule with a primer comprising an RNA polymerase site and RNA polymerase.
389. The method of claim 384, wherein the masking construct: suppresses generation of a detectable positive signal until the masking construct is cleaved or deactivated, or masks a detectable positive signal or generates a detectable negative signal until the masking construct is cleaved or deactivated.
390. The method of claim 384, wherein the masking construct comprises: a. a silencing RNA that suppresses generation of a gene product encoded by a reporting construct, wherein the gene product generates the detectable positive signal when expressed; b. a ribozyme that generates the negative detectable signal, and wherein the positive detectable signal is generated when the ribozyme is deactivated; c. a ribozyme that converts a substrate to a first color and wherein the substrate converts to a second color when the ribozyme is deactivated; d. an aptamer and/or a polynucleotide-tethered inhibitor; e. a polynucleotide to which a detectable ligand and a masking component are attached; f. a nanoparticle held in aggregate by bridge molecules, wherein at least a portion of the bridge molecules comprises a polynucleotide, and wherein the solution undergoes a color shift when the nanoparticle is dispersed in solution; g. a quantum dot or fluorophore linked to one or more quencher molecules by a linking molecule, wherein at least a portion of the linking molecule comprises a polynucleotide; h. a polynucleotide in complex with an intercalating agent, wherein the intercalating agent changes absorbance upon cleavage of the polynucleotide; or i. two fluorophores tethered by a polynucleotide that undergo a shift in fluorescence when released from the polynucleotide.
391. The method of claim 390, wherein the aptamer a. comprises a polynucleotide-tethered inhibitor that sequesters an enzyme, wherein the enzyme generates a detectable signal upon release from the aptamer or polynucleotide-tethered inhibitor by acting upon a substrate; or b. is an inhibitory aptamer that inhibits an enzyme and prevents the enzyme from catalyzing generation of a detectable signal from a substrate or wherein the polynucleotide-tethered inhibitor inhibits an enzyme and prevents the enzyme from catalyzing generation of a detectable signal from a substrate; or c. sequesters a pair of agents that when released from the aptamers combine to generate a detectable signal.
392. The method of claim 390, wherein the nanoparticle is a colloidal metal.
393. The method of claim 384, wherein the at least one guide polynucleotide comprises a mismatch.
394. The method of claim 384, wherein the mismatch is up- or downstream of a single nucleotide variation on the one or more guide sequences.
395. A cell or organism comprising the engineered CRISPR-Cas protein according to any of claims 1 to 367, the polynucleic acid according to claim 368 or 369, the CRISPR-Cas system according to claim 370, or the vector or vector system according to claim 371.